Implementations
LDA Model
class core.lda_engine.LdaModelWrapper(filename, force_load=False, np=True, keep_state=True)
Generates the top N relevant topics of an author in our database.
Parameters:
- author_id – The author’s ID in our database.
- top – Number of topics to be returned.
Returns: A NumPy array of topic IDs and their confidence levels.
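As a rough illustration of the top-N topic lookup described above, the sketch below ranks a topic distribution by weight in plain Python. The function name `top_topics` and the sample weights are hypothetical, and a list of tuples stands in for the NumPy array that the documented call returns; how the wrapper obtains an author's distribution is not shown here.

```python
def top_topics(topic_weights, top=3):
    """Return the `top` highest-weighted (topic_id, weight) pairs."""
    # Pair each weight with its index, which stands in for the topic ID.
    ranked = sorted(enumerate(topic_weights), key=lambda pair: pair[1], reverse=True)
    return ranked[:top]


# Hypothetical topic distribution for one author (index = topic ID).
author_weights = [0.05, 0.40, 0.10, 0.30, 0.15]
result = top_topics(author_weights, top=2)
print(result)
```

Sorting the full distribution is fine for a model with a few hundred topics; a partial sort would only matter for much larger models.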
get_topic_in_list(topic_id)
Given a topic ID in the model, generates a list of terms.
Parameters:
- topic_id – The topic’s ID in the model.
Returns: A list of terms.
get_topic_in_string(topic_id, top=5)
Given a topic ID in the model, generates a string representation of that topic.
Parameters:
- topic_id – The topic’s ID in the model.
- top – Top N relevant terms.
Returns: A string representation of the topic.
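To make the relation between a topic's term list and its string form concrete, here is a minimal sketch. It assumes the terms are already ordered by relevance and joins them with a comma; the helper name `topic_to_string`, the separator, and the sample terms are all assumptions, since the actual string format is not documented here.

```python
def topic_to_string(terms, top=5):
    """Join the `top` most relevant terms of a topic into one string."""
    # `terms` is assumed to be ordered from most to least relevant.
    return ", ".join(terms[:top])


# Hypothetical term list for a single topic.
terms = ["neural", "network", "training", "gradient", "layer", "loss"]
label = topic_to_string(terms, top=3)
print(label)
```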
get_topics_in_string(topics, confidence=False)
Converts a list of topics (with or without confidence levels) to a list of dictionaries holding their string representations.
Parameters:
- topics – The list of topics to be converted.
- confidence – Set this to True if the input topics contain confidence levels.
Returns: A list of dictionaries containing the string representations (and the confidence levels, if provided).
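The effect of the `confidence` flag can be sketched as follows. This is an illustration under stated assumptions, not the wrapper's code: the helper `topics_to_dicts`, the `labels` lookup table, and the dictionary keys `"topic"` and `"confidence"` are hypothetical, because the real key names are not given in this reference.

```python
def topics_to_dicts(topics, labels, confidence=False):
    """Convert topic IDs (optionally paired with confidence) to dicts."""
    converted = []
    for item in topics:
        if confidence:
            # Each item is assumed to be a (topic_id, confidence) pair.
            topic_id, level = item
            converted.append({"topic": labels[topic_id], "confidence": level})
        else:
            # Each item is a bare topic ID.
            converted.append({"topic": labels[item]})
    return converted


# Hypothetical string representations keyed by topic ID.
labels = {0: "neural, network", 1: "graph, node"}
plain = topics_to_dicts([0, 1], labels)
scored = topics_to_dicts([(1, 0.7), (0, 0.3)], labels, confidence=True)
print(plain)
print(scored)
```

Passing pairs with `confidence=False` would fail on the lookup, which is why the flag has to match the shape of the input.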
Matching Algorithm
core.matching.lda.detailed_results(results, model_name)
Retrieves the details of the matched authors (i.e., reviewers).
Parameters:
- results – A dictionary generated by match_by_lda with detailed=True.
- model_name – The name of the current model.
Returns: A dictionary of author details. Running it on the demo model shows an example of the output.
core.matching.lda.match_by_lda(text, model_name, top=50, detailed=True, scoring_impl='default', base=0)
Gives the best matching result for a string of raw text.
Parameters:
- text – The text to be matched.
- model_name – The name of the LDA model to be used.
- top – The maximum number of results to be returned.
- detailed – Whether to return a detailed result. Keep this True unless the function is called from outside the web app.
- scoring_impl – The scoring implementation to be used.
- base – The initial value of the vector.
Returns: The matched result as a dictionary if detailed=True; otherwise, a matrix of author IDs and scores.
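The ranking step behind a match can be sketched as follows, assuming the input text has already been converted to a topic vector. Everything here is illustrative: `rank_authors`, the dot-product scoring, and the sample vectors are assumptions, and the sketch does not model the `base` parameter or the `'default'` scoring implementation, whose behavior is not described in this reference.

```python
def rank_authors(paper_vec, author_vecs, top=50):
    """Rank authors by dot product with the paper vector, best first."""
    scores = []
    for author_id, author_vec in author_vecs.items():
        # A plain dot product stands in for the real scoring implementation.
        match = sum(p * a for p, a in zip(paper_vec, author_vec))
        scores.append((author_id, match))
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top]


# Hypothetical topic vectors for one paper and three authors.
paper = [0.7, 0.2, 0.1]
authors = {101: [0.6, 0.3, 0.1], 102: [0.1, 0.1, 0.8], 103: [0.3, 0.4, 0.3]}
ranking = rank_authors(paper, authors, top=2)
print(ranking)
```

The list of (author ID, score) pairs corresponds loosely to the non-detailed return value; the detailed form would attach author details to each entry.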
core.matching.lda.score(paper_vec, author_vec, method)
Scores a paper-author match.
Parameters:
- paper_vec – The vector to be matched.
- author_vec – The vector to be scored against (usually an author's vector).
- method – The name of the scoring implementation.
Returns: A scalar score for the match.
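Selecting a scoring implementation by name could look like the sketch below. The method names `"cosine"` and `"dot"`, the `SCORERS` registry, and the function `score_match` are hypothetical; this reference does not list which implementations the library actually ships.

```python
import math


def dot_product(paper_vec, author_vec):
    """Sum of element-wise products of the two vectors."""
    return sum(p * a for p, a in zip(paper_vec, author_vec))


def cosine(paper_vec, author_vec):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    norm = math.sqrt(dot_product(paper_vec, paper_vec)) * math.sqrt(
        dot_product(author_vec, author_vec)
    )
    return dot_product(paper_vec, author_vec) / norm if norm else 0.0


# Hypothetical registry mapping method names to scoring functions.
SCORERS = {"cosine": cosine, "dot": dot_product}


def score_match(paper_vec, author_vec, method):
    """Look up the scoring function by name and apply it."""
    if method not in SCORERS:
        raise ValueError(f"unknown scoring method: {method!r}")
    return SCORERS[method](paper_vec, author_vec)


same = score_match([1.0, 0.0], [2.0, 0.0], "cosine")
orthogonal = score_match([1.0, 0.0], [0.0, 3.0], "cosine")
print(same, orthogonal)
```

A name-to-function table keeps the caller's `method` argument a plain string, which matches how `scoring_impl` is passed as a string elsewhere in this module.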