Senior Data Science Engineer

Location: San Francisco, CA
Date Posted: 10-13-2016
About Protagonist Technology 

Protagonist Technology is not an ordinary analytics company, or an ordinary software startup. At the core of our unique solution is a rapidly emerging technology category called Narrative Analytics. This unique human-in-the-loop technology, where human analysts work in tandem with the most powerful artificial intelligence algorithms, goes beyond basic NLP techniques like sentiment analysis, text categorization, etc.

Narratives are underlying beliefs and attitudes that drive human behavior. In commerce, politics, and society, media voices are multiplying, facts are losing their currency, and narratives are filling the void to drive people’s attitudes and behavior. Global corporations and foundations contend with narrative challenges that present great opportunities as well as existential threats to their business models. These organizations need new ways to change the conversation, build their brand and reputation, and understand and activate their customers. We are the only company to bring a robust data-driven solution to the analysis of narratives. Our growing customer list, which includes Citibank, HPE, Microsoft, Starbucks, WB, State Street Global Advisors, Mars, General Mills, Pfizer, Charles Schwab, PG&E, VMWare, The Gates Foundation, Rockefeller Foundation, and more, validates our unique approach.

At Protagonist Technology we have 10 years of experience with qualitative (human) analysis of narratives and deep in-house expertise in identifying, analyzing, structuring, and understanding narratives in news and social media. We marry that qualitative expertise with deep expertise in NLP and machine learning (deep learning, LSTM, etc.) to bring scale, efficiency, and new insight to our long-standing methods. Through this process we deliver a more nuanced, complex, and valuable set of insights to our customers than traditional NLP and analytics companies. This knowledge and background in narratives positions Protagonist Technology as a market and thought leader in this emerging category, which delivers the richness of insight of traditional research with the scale and efficiency of social media monitoring.

We are a revenue-generating start-up, backed by respected investors. We are looking for high-energy team members who embody our values of drive, generosity, exploration, and joy.

Reporting to the Director of Data Science, the Senior Data Science Engineer is responsible for building the Narrative Analytics™ tools and capabilities that drive automation, using machine learning and network science techniques, to achieve operational scale and develop new solutions by analyzing textual/unstructured content. This person will also be responsible for uncovering new insights (causation, correlation, etc.) in the underlying data (both numeric and textual) by applying data engineering practices, statistical modeling, and effective visualizations to present those insights. The role will require flexibility in learning new statistical and mathematical models, designing new experiments, creating and fine-tuning models to automate narrative analytics processes, and developing new insightful metrics. As a senior DS engineer, this person will mentor junior DS engineers and will actively work with client-facing teams to understand pain points and brainstorm solutions.

The Senior Data Science Engineer in this role will work through quite a few unknowns and inadequate data points, which will require calibrating experiments, making choices and tradeoffs, and communicating them effectively to the Product Manager and other members of the team. This person also needs to be aware of the vendor landscape and of open source tools and technologies, in order to leverage existing capabilities and build new capabilities on top of what’s available.
Since the path to developing the components may be difficult, the Senior Data Science Engineer will need to be innovative, tenacious, and solution-oriented to drive positive outcomes. This position is based in our San Francisco office.
Primary Responsibilities
  • Conduct data analysis and data munging, design new experiments, test hypotheses, and apply complex algorithms to textual (unstructured/semi-structured) data to create optimal, semantically similar content groupings/clusters
  • Design experiments, create and train models, and validate findings on supervised learning algorithms
  • Design new metrics and refine existing ones, using underlying data enrichments and other metadata, to answer key narrative impact and monitoring questions
  • Work with the Product Managers and other business stakeholders to understand their business needs and translate them into actionable results (either as metrics or better narrative analytics capabilities)
  • Design new algorithms and advance the field of quantitative narrative analytics
  • Work with data scientists and machine learning professional communities to contribute to and benefit from the research in the AI space
  • Work with Computational Linguists in bringing machine learning capabilities to their endeavors in natural language understanding and narrative analysis
  • Ensure that data sets are reliable, accurate, and trustworthy for model development and optimization
  • Engage with the narrative technology platform team (product and technology) to drive integration of tools, solutions, and/or predictive capabilities onto the narrative technology platform and provide requirements for their data needs
  • Regularly provide status updates on the development of narrative analysis capabilities, metrics, and predictive models to key stakeholders
  • Mentor and guide junior members of the data science team
Must haves:
  • 6+ years as a Data Scientist or Machine Learning engineer and a total of 10+ years of experience as a data engineer
  • Master's/PhD in Computer Science, Math, Physics, Statistics, or a related engineering discipline
  • Strong programming skills in Python, Java, or C/C++
  • Experience with SQL or NoSQL databases
  • Experience using various statistical analysis tools, such as R, MATLAB, etc.
  • Experience with various open source machine learning libraries
  • Experience working with Deep Learning techniques (LSTM, CNN, etc.)
  • Expertise in clustering techniques, generative techniques, and other unsupervised methods
  • Must be referred to as a "genius" by your friends (your mom doesn't count)
Nice to haves:
  • Background in text analysis/NLP tools and techniques (sentiment, NER, topic modeling, etc.)
  • Experience with reporting and visualization, using Tableau or other data visualization tools to showcase the results of the analysis
  • Experience with data platforms, such as Spark, Hive, etc.
  • Excellent interpersonal and communication skills, including the ability to describe the logic and implications of complex models to stakeholders (in layperson terms)
  • Collaborative team player, with empathy for clients and colleagues
  • Self-starter, who can proactively identify and solve problems
  • Strong interest in academic research and problem solving in a collaborative, team environment