I use data science to help people identify legislation they may be interested in given their location, their legislators, and their policy interests.
The question I set out to answer: what state legislation should I pay attention to?
This might include legislation that:
This information comes from bill text and metadata in OpenStates, a database covering state legislators and legislation for all 50 states. It includes bill text, topics, sponsors, progress through committees, amendments, vote tallies, and votes by individual legislators.
Natural Language Processing is a field of data science concerned with processing text as data. I use Python modules (spaCy, NLTK, scikit-learn) to perform common NLP tasks such as:
I then create a numeric representation of the text that can be used to build data visualizations, calculate statistics, and serve as the input to a machine learning model. The goal of the ML model is to answer some of these questions prospectively (e.g., can I predict which bills will be contentious before they are actually voted on, so that people can call their legislator and express an opinion?).
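As a rough illustration of what a numeric representation of text can look like, here is a minimal sketch using only the Python standard library. It assumes a TF-IDF weighting, which is one common choice and not necessarily the one used in the talk. The bill snippets and the helper names (`tokenize`, `tfidf_vectors`, `cosine`) are hypothetical, invented for this example.

```python
import math
import re
from collections import Counter

# Hypothetical bill snippets, invented for illustration (not OpenStates data).
bills = {
    "HB 1": "An act relating to school funding and teacher pay.",
    "SB 2": "An act relating to highway funding and road repair.",
    "HB 3": "An act relating to teacher certification and school safety.",
}


def tokenize(text):
    """Lowercase the text and keep runs of letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return (vocabulary, {name: vector}) with simple TF-IDF weights.

    Assumption: weight = raw term count * (ln(N / document frequency) + 1).
    Libraries may use slightly different formulas.
    """
    tokenized = {name: tokenize(text) for name, text in docs.items()}
    vocab = sorted({tok for toks in tokenized.values() for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: in how many documents each token appears.
    df = Counter(tok for toks in tokenized.values() for tok in set(toks))
    idf = {tok: math.log(n_docs / df[tok]) + 1.0 for tok in vocab}
    vectors = {}
    for name, toks in tokenized.items():
        counts = Counter(toks)
        vectors[name] = [counts[tok] * idf[tok] for tok in vocab]
    return vocab, vectors


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


vocab, vectors = tfidf_vectors(bills)
sim_school = cosine(vectors["HB 1"], vectors["HB 3"])  # two education bills
sim_road = cosine(vectors["HB 1"], vectors["SB 2"])    # education vs. roads
print(f"HB 1 vs HB 3: {sim_school:.2f}")
print(f"HB 1 vs SB 2: {sim_road:.2f}")
```

Each bill becomes a vector with one entry per vocabulary word, so the two education bills end up closer to each other than either is to the highway bill. The same vectors could feed a plot, a summary statistic, or a classifier. In practice a library class such as scikit-learn's `TfidfVectorizer` would replace the hand-written helpers.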
In answering my substantive question, I illustrate uses of NLP modules so that attendees will be able to apply these tools to other text problems they may be interested in.