Garth Wiki


Author

Sebastian Marzjan

sebastianm@wikia-inc.com

Date

Links

Text

Hi Team,


Today I’d like to share a bit of knowledge about Holmes.

TL;DR

Holmes is a standalone service that determines what type of thing (e.g. a book, a character, or a movie) a given article is about.

Technical documentation (a bit sparse, though) is available at https://one.wikia-inc.com/wiki/Engineering/Holmes

Slightly longer version

Why?

During work on Sony-related projects, the team realized that our Content APIs had an unacceptably high false-positive rate. For example, when looking for an episode of a TV show, the API would return a character of the same name. Since the Content APIs search our Solr indexes, a filtering mechanism was created: article types.

What?

An article type describes what the article is about. It is assigned to an article through machine learning-based classification.

How?

We used the Weka library (http://www.cs.waikato.ac.nz/ml/weka/) to train the classifiers. Holmes itself is composed of three components: the classifiers (the core functionality) and two interfaces exposing them, an HTTP service and a RabbitMQ consumer. Holmes can be deployed to both prod and dev environments using Wikia’s deploy tools (see the documentation on One).
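To make the three-component layout a bit more concrete, here is a minimal Python sketch of one classifier core exposed through two thin interfaces. It is an illustration under assumptions, not Holmes code: the keyword rule merely stands in for the Weka-trained classifiers, and the names `classify`, `handle_http_request`, `consume_message`, the `publish` callback, and the JSON field names are all hypothetical.

```python
import json

# Toy stand-in for the trained classifiers: keyword hints per article type.
# Assumption: the real classifiers are trained with Weka and look nothing like this.
_KEYWORD_HINTS = {
    "book": ("novel", "author", "published"),
    "character": ("protagonist", "portrayed", "born"),
    "movie": ("film", "directed", "box office"),
}


def classify(text):
    """Core: return the type whose hint words occur most often, else 'other'."""
    lowered = text.lower()
    scores = {
        article_type: sum(lowered.count(word) for word in words)
        for article_type, words in _KEYWORD_HINTS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "other"


def handle_http_request(body):
    """Interface 1 (request/response style): JSON in, (status, JSON) out.

    The payload shape {"text": ...} is a hypothetical choice for this sketch.
    """
    try:
        text = json.loads(body)["text"]
    except (ValueError, KeyError, TypeError):
        return 400, json.dumps({"error": "expected a JSON object with a 'text' field"})
    return 200, json.dumps({"type": classify(text)})


def consume_message(message, publish):
    """Interface 2 (queue-consumer style): decode a message, classify, publish.

    `publish` is a hypothetical callback; a real consumer would hand the
    result to its message broker client instead.
    """
    payload = json.loads(message.decode("utf-8"))
    publish({"article_id": payload["article_id"], "type": classify(payload["text"])})
```

Both interfaces only translate between their transport format and `classify`, so the classification logic lives in one place and can be tested without running a web server or a message broker.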

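To tie this back to the false-positive problem described under "Why?", the sketch below shows how an article type can act as a filter on search results. It works on an in-memory list purely for illustration; how the Content APIs actually apply the filter against Solr is not described in this email. The `SearchHit` class, the `article_type` field, the type labels, and the sample data are all made up.

```python
from dataclasses import dataclass


@dataclass
class SearchHit:
    title: str
    article_type: str  # hypothetical field holding the assigned article type


def filter_by_type(hits, wanted_type):
    """Keep only the hits whose article type matches the requested one."""
    return [hit for hit in hits if hit.article_type == wanted_type]


# Made-up data: an episode and a character that share a name.
hits = [
    SearchHit(title="Rose", article_type="tv_episode"),
    SearchHit(title="Rose", article_type="character"),
]

# A caller looking for the episode no longer gets the character back.
episodes = filter_by_type(hits, "tv_episode")
```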

Hope this is useful!

Questions / comments are welcome.


Cheers,

Sebastian & Data/API team


--

SEBASTIAN MARZJAN

LEAD SOFTWARE ENGINEER, DATA/API TEAM



Videogames / Entertainment / Lifestyle

500 3rd Street, San Francisco, CA 94107

E sebastianm@wikia-inc.com

WEB WIKIA <http://wikia.com/>
