Overview¶
A set of functions that uses scikit-learn to conduct a TF-IDF analysis to isolate keywords from event-based documents. It answers the following questions:
- What keywords represent a particular period of content?
- What keywords represent a particular group of content from a particular period?
It assumes you have:
- imported your corpus as a pandas DataFrame,
- included metadata information, such as a list of dates and list of groups to reorganize your corpus, and
- pre-processed your documents.
It functions only with Python 3.x and is not backwards-compatible.
Warning: evekeys performs little to no custom error-handling, so make sure your inputs are formatted properly. If you have questions, please let me know via email.
System requirements¶
- pandas
- sklearn
- tqdm
Installation¶
pip install evekeys
Known Issues or Limitations¶
- Please contact me if you discover any issues.
Example notebooks¶
- Coming soon.