Page MenuHome

Algolia → ElasticSearch

Authored by Stephan preeker (Preeker) on Dec 8 2017, 5:08 PM.

Diff Detail

rPS Pillar

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
27 ↗(On Diff #9700)

this should be logged at debug level, if at all.

82 ↗(On Diff #9700)

use an explicit class instead of a dict; see see comments about prepare_user_data(); they apply here too.

98 ↗(On Diff #9700)

just use if node.get('description'):

136 ↗(On Diff #9700)

remove newline

142 ↗(On Diff #9700)

remove newline

143 ↗(On Diff #9700)

this is not true -- the user hasn't been pushed yet at this point in time.

159 ↗(On Diff #9700)

This isn't called by magic "when a user item is updated".

Also, the function name should reflect what the function does, not necessarily when it should be called. This is also inconsistent with node_save(), which is apparently named with a different naming strategy.

184 ↗(On Diff #9700)

Algolia is used even when SEARCH_BACKENDS does not include it.

12 ↗(On Diff #9700)

Just name it manager_elastic to be consistent with other manager_xxx living in module xxx.

13 ↗(On Diff #9700)

Usage information shouldn't include function calls.

26 ↗(On Diff #9700)

Be more explicit in what parameters the function accepts, so that it can be understood from its --help message (rather than from reading the source code).

47 ↗(On Diff #9700)

Log *what* is indexed; "%d users" is more descriptive than "%d".

65 ↗(On Diff #9700)

Either refactor into common code, or just leave as-is but remove the "stolen from..." comment.

69 ↗(On Diff #9700)

This is not done; stealing is accepable, but be careful not to copy too much.

79 ↗(On Diff #9700)

pipeline is unused but _public_project_ids() is called twice.

84 ↗(On Diff #9700)

Just use in-line dict here, no need to use a separate variable.

84 ↗(On Diff #9700)

Here you fetch all nodes in public projects, but later in prepare_node_data() you also discard non-published nodes. Just filter them here too, so that you don't have to do any processing on them.

87 ↗(On Diff #9700)

see comment about users

96 ↗(On Diff #9700)

except is not a function call; space after it.

Also, what exactly would cause a KeyError or AttributeError? Why choose to handle those here, and not in prepare_node_data or index_node_save?

112 ↗(On Diff #9700)

Be explicit when an unknown index name is given. Make sure that --help shows which index names can be used.

81 ↗(On Diff #9700)

This should be a set and not a list, since order doesn't matter and duplicates are not allowed.

83 ↗(On Diff #9700)

Add comments to each configuration option to explain what it does.

11 ↗(On Diff #9700)

Delete unused code.

13 ↗(On Diff #9700)

what is too ambiguous. "What I'm looking for" (e.g. the goal of the search) is something radically different than "what my search query is" (e.g. my attempt at reaching that goal), and maybe this even means something else.

23 ↗(On Diff #9700)

Remove unnecessary variables. Just use elasticSearcher. Also, add a comment that explains where this variable comes from.

25 ↗(On Diff #9700)

remove unused code.

33 ↗(On Diff #9700)

See earlier note: what you are looking for is not the same as the search query (one is well-defined, the other is not).

34 ↗(On Diff #9700)

This immediately overwrites the default value, so just set the default to this.

40 ↗(On Diff #9700)

avoid unnecessary braces.

46 ↗(On Diff #9700)

Clean up indentation & code.

49 ↗(On Diff #9700)

Space between ){

140 ↗(On Diff #9700)

facets is not defined at an outer scope, so assigning it here is moot.

144 ↗(On Diff #9700)

This code *really* needs documentation about what it is doing and why it is doing it that way. And some cleanup. Without that I can't review it (as I don't know what your intent is, I can only guess, and that won't make for a good review).

149 ↗(On Diff #9700)

"refined" is a concept that should be documented, or at least a comment that explains what it is should be included.

215 ↗(On Diff #9700)

Just calculate the proper bounds for the for-loop, instead of looping over more than you need.

224 ↗(On Diff #9700)

don't keep commented-out code

234 ↗(On Diff #9700)

Why this strange off-by-one? I would expect prev_page = - 1 and next_page = + 1 (of course with some boundary checks).

263 ↗(On Diff #9700)

keep indentations clean

268 ↗(On Diff #9700)

delete unused code

358 ↗(On Diff #9700)

No need to concatenate, just set var urlParams = '?q=' + … here.

359 ↗(On Diff #9700)

Here again there is a strange base-1 to base-0 translation. If different parts of the same page use different bases for the pagination, this should be very explicit.

362 ↗(On Diff #9700)

Remove unused code

376 ↗(On Diff #9700)

remove newline

3 ↗(On Diff #9700)

Remove algolia reference

6 ↗(On Diff #9700)

First opening parenthesis is not necessary (and neither is the matching closing one). Also remove those from the other function declarations, unless they are really necessary (but then add a comment explaining why).

8 ↗(On Diff #9700)

What is deze? This function follows a very specific design, which should be explained a bit in a comment.

39 ↗(On Diff #9700)

Document which types are supported, and what the callback function will get as parameters.

43 ↗(On Diff #9700)

Be clear in your documentation

44 ↗(On Diff #9700)

Document why the unused data parameter is there, or remove it.

48 ↗(On Diff #9700)

Document what acceptable values are.

56 ↗(On Diff #9700)

Document what "being refined" means.

63 ↗(On Diff #9700)

Write full English sentences, space after //, remove double spaces.

75 ↗(On Diff #9700)

Use .done() and .fail() callbacks. Handle error conditions!

82 ↗(On Diff #9700)

remove unused code

111 ↗(On Diff #9700)

Remove console.log() calls unless end users have a real use for them.

112 ↗(On Diff #9700)

Document what q, cb, and async are. I'm unsure what async is for, as it's called equally synchronously as cb.

122 ↗(On Diff #9700)

don't declare this as a JQuery function, because you never use $('some-selector').getSearch(). As far as I can see, this can just be implemented as part of findMatches

131 ↗(On Diff #9700)

Handle error conditions!

137 ↗(On Diff #9700)

Document why it is necessary to call slice(0) twice.

150 ↗(On Diff #9700)

unused code

9 ↗(On Diff #9700)

fix indentation

25 ↗(On Diff #9700)

Document what this URL-like thing is.

Sybren A. Stüvel (sybren) requested changes to this revision.Dec 13 2017, 4:39 PM
This revision now requires changes to proceed.Dec 13 2017, 4:39 PM

According to 533544117bf3 searching on user ID should work, but it doesn't in /u/ (even after reindexing users).

Stephan preeker (Preeker) marked 41 inline comments as done.Dec 15 2017, 5:52 PM
Stephan preeker (Preeker) added inline comments.
28 ↗(On Diff #9700)

this is a Aesthetics point of view. When developing testing each field/parameter might need to be changed / commented out. configuring/edits and changes are way easier this way.

119 ↗(On Diff #9700)

its called defensive programmming :)

147 ↗(On Diff #9700)

I have no clue. logic was there in algolia as well..


if prefer it there?

80 ↗(On Diff #9700)

unfortinatly using the logger, we lose all white space in the logger makeing query json debugging i resorted to a print :)

70 ↗(On Diff #9700)

this should be checked...

82 ↗(On Diff #9700)

this was "inherited" old code :) it is a nested structure with no methods. I dont think a class thing would help.

96 ↗(On Diff #9700)

data integrity stuff. objects missing expected values.. this especially happend before I applied filtering..We do want to keep indexing and not completely come to an halt. In all other occasions it should crash. so I handle the execeptions here.

112 ↗(On Diff #9700)

couldnt find an exmple to do this..

82 ↗(On Diff #9700)

A class would definitely help, because then you can go to its definition and see its fields, declare variables/parameters to be instances of that class, check with isinstance() if you want to be sure at runtime, etc.

This is especially important when passing from function to function, and even more so when different things are all called "user" or "nodes"

96 ↗(On Diff #9700)

Fair enough; something this would be great to put in the code as a comment.

112 ↗(On Diff #9700)

Then find a solution; an alternative would be to have three separate reindex_{all,users,nodes} functions.

Stephan preeker (Preeker) marked 3 inline comments as done.
  • wip D2950
  • wip D2950
  • Merge branch 'master' of into elastic
  • remove algolia from css and vendor stuff
  • javascript debugging
  • remove dead code
  • make javascript more secure..
Stephan preeker (Preeker) marked 2 inline comments as done.Dec 29 2017, 5:05 PM
31 ↗(On Diff #9741)

Why doesn't this use the connection created in

Sybren A. Stüvel (sybren) requested changes to this revision.Jan 3 2018, 6:21 PM
Sybren A. Stüvel (sybren) added inline comments.
47 ↗(On Diff #9741)

This should not log the same as a NotFoundError, as it's executed when there is *not* an exception.

This revision now requires changes to proceed.Jan 3 2018, 6:21 PM
  • better error reporting
  • improve elastic server settings
  • Merge branch 'master' of into elastic
  • merge
Stephan preeker (Preeker) marked 28 inline comments as done.Jan 5 2018, 11:57 AM
  • minor documentation / annotation fixes

Abandoning this review because we've already moved to ElasticSearch quite a while ago :)