Author
Sebastian Marzjan
sebastianm@wikia-inc.com
Date
Links
- https://one.wikia-inc.com/wiki/Engineering/Holmes
- http://www.cs.waikato.ac.nz/ml/weka/
- http://wikia.com/
- http://www.facebook.com/wikia
- http://twitter.com/wikia
Text
Hi Team,
Today I'd like to share a bit of knowledge about Holmes.
TL;DR
Holmes is a standalone service used to determine what type of thing (e.g. a Book, a Character or a Movie) a given article is about.
Technical documentation (a bit sparse, though) is available at https://one.wikia-inc.com/wiki/Engineering/Holmes
Slightly longer version
Why?
During work on Sony-related projects, it became clear to the team that our Content APIs had an unacceptably high false positive rate. Example: when looking for an episode of a TV show, the API would return a character of the same name. Since the Content APIs rely on searches over our Solr indexes, a filtering mechanism was created: article types.
What?
An article type describes what the article is about. It is assigned to an article through machine learning-based classification.
How?
We used the Weka library ( http://www.cs.waikato.ac.nz/ml/weka/ ) to train the classifiers. Holmes itself is composed of three components: the classifiers (the core functionality) and two interfaces exposing them, an HTTP service and a RabbitMQ consumer. Holmes can be deployed to both prod and dev environments using Wikia's deploy tools (see documentation on One).
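To make the three-component layout concrete, here is a rough sketch in Python of one classifier core shared by two thin interfaces. It is only an illustration: the keyword rules stand in for the trained Weka classifiers, and every name in it (`classify`, `handle_http`, `handle_message`, the JSON fields, the type labels) is hypothetical rather than taken from the Holmes code base. No real HTTP server or RabbitMQ connection is involved.

```python
import json

# Hypothetical keyword rules standing in for the trained Weka classifiers.
KEYWORDS = {
    "tv_episode": ("episode", "season", "aired"),
    "character": ("character", "portrayed", "born"),
    "book": ("novel", "author", "published"),
}


def classify(text: str) -> str:
    """Core: return the article type whose keywords occur most often."""
    words = text.lower().split()
    scores = {
        label: sum(words.count(keyword) for keyword in keywords)
        for label, keywords in KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    # Fall back to a generic label when no keyword matched at all.
    return best if scores[best] > 0 else "other"


def handle_http(body: bytes) -> bytes:
    """HTTP-style interface (assumed shape): JSON request in, JSON response out."""
    payload = json.loads(body)
    return json.dumps({"type": classify(payload["text"])}).encode("utf-8")


def handle_message(body: bytes) -> dict:
    """Queue-style interface (assumed shape): one message in, one record out."""
    payload = json.loads(body)
    return {
        "article_id": payload["article_id"],
        "type": classify(payload["text"]),
    }
```

The point of the layout is that both interfaces call the same `classify` function, so a synchronous request and a queued message for the same text get the same article type.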
Hope this is useful!
Questions / comments are welcome.
Cheers,
Sebastian & Data/API team
--
SEBASTIAN MARZJAN
LEAD SOFTWARE ENGINEER, DATA/API TEAM
Videogames / Entertainment / Lifestyle
500 3rd Street, San Francisco, CA 94107
E sebastianm@wikia-inc.com
WEB WIKIA <http://wikia.com/>