SB-10k: German Sentiment Corpus

SB-10k is a publicly available corpus of 9,738 German tweets, each labeled by three annotators as “positive”, “negative”, “neutral”, “mixed”, or “unknown”. It was created by SpinningBytes in collaboration with the Zurich University of Applied Sciences (ZHAW).
Details and Download

Supplementary Material for Publications

We provide supplementary material for some of our publications, including data and code. Please click the publication title to access the material.

Word Embeddings

Word embeddings are used in Natural Language Processing (NLP) to map words to vector representations. They are used, for instance, in deep learning algorithms for named entity extraction, sentiment analysis or chatbots.
We provide publicly available word embeddings for several languages, including English, German and French.
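To illustrate what mapping words to vectors means in practice, here is a minimal Python sketch. The three-dimensional vectors, the `EMBEDDINGS` table and the helper names are invented for illustration and are not taken from the published embeddings, which are learned from large corpora and have far more dimensions.

```python
import math

# Toy vectors, made up for this example only. They are NOT values from
# the downloadable embeddings.
EMBEDDINGS = {
    "gut": [0.9, 0.1, 0.0],
    "super": [0.8, 0.2, 0.1],
    "schlecht": [-0.7, 0.3, 0.1],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity(word_a, word_b, embeddings=EMBEDDINGS):
    """Look up both words and compare their vectors."""
    return cosine_similarity(embeddings[word_a], embeddings[word_b])


if __name__ == "__main__":
    print(similarity("gut", "super"))     # words with similar meaning
    print(similarity("gut", "schlecht"))  # words with opposite sentiment
```

With these toy values, "gut" and "super" score higher than "gut" and "schlecht". Downstream models such as sentiment classifiers rely on this kind of geometric closeness between related words.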
Details and Download
