How to Describe your Kickstarter Campaign (from a Data Science perspective)

Isaac Chan
4 min read · Apr 22, 2020

This article presents an Exploratory Data Analysis of historical Kickstarter campaign descriptions, based on work I recently completed for a company, and offers some insights on how you should describe your Kickstarter campaign.

Source: ICO Partners

I just finished building and deploying a Machine Learning model that predicts the probability of a Kickstarter campaign succeeding based simply on how the campaign had been described by its creators.

I decided to write this short article, which contains some Exploratory Data Analysis of the data I used to train the model. This article is geared towards entrepreneurs who want to leverage Kickstarter and data enthusiasts who want to pick up some introductory elements of Natural Language Processing.

The data used were from Kickstarter campaigns held from January to March of 2020.

How long should your campaign description be?

This chart shows the frequency of descriptions that contain a certain number of words. We can see that most descriptions are about 15 to 25 words long. Few descriptions exceed 30 words, while there seem to be descriptions that have fewer than 5 words too!
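For readers who want to reproduce a chart like this, here is a minimal sketch of how the word counts might be tallied. The sample descriptions are made up for illustration, and splitting on whitespace is an assumption; the article does not say how words were counted.

```python
from collections import Counter

# Hypothetical sample data; the real descriptions are not shown in this article.
descriptions = [
    "A new album of original folk songs",
    "Help us print our first enamel pin collection",
    "Short film about a lighthouse keeper",
    "Enamel pins",
]

def word_count(text):
    """Count whitespace-separated tokens in a description."""
    return len(text.split())

# Map each word count to the number of descriptions with that length.
length_frequency = Counter(word_count(d) for d in descriptions)

# Print a simple text histogram, shortest descriptions first.
for n_words, n_descriptions in sorted(length_frequency.items()):
    print(f"{n_words:>3} words | {'#' * n_descriptions}")
```

On a real dataset, the same `Counter` could be passed to any plotting library to draw the bar chart.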

This next visualisation shows the proportion of descriptions at each word count. Descriptions that are 20 to 22 words long make up almost 30% of all campaign descriptions.
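A proportion such as "20 to 22 words" can be computed directly from the list of word counts. The sketch below uses a hypothetical helper, `share_in_range`, and invented numbers, so the result will not match the figure quoted above.

```python
def share_in_range(word_counts, low, high):
    """Return the fraction of descriptions whose length is in [low, high]."""
    if not word_counts:
        return 0.0
    in_range = sum(1 for n in word_counts if low <= n <= high)
    return in_range / len(word_counts)

# Hypothetical word counts, one per description (not the article's data).
word_counts = [5, 12, 18, 20, 21, 22, 22, 25, 27, 31]

share = share_in_range(word_counts, 20, 22)
print(f"{share:.0%} of descriptions have 20 to 22 words")
```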

Does description length make a difference?

Deeper statistical analysis could test whether length makes a statistically significant difference, but based on this visualisation it seems that failed campaigns might have slightly longer descriptions. However, I’m not at all confident in this conclusion, as the difference seems quite small. Moreover, campaign descriptions that are too long may be unhelpful too, so the distribution of successes and failures may differ across exact word counts.
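One way to carry out that deeper analysis is a permutation test on the difference in mean word count. This is only a sketch of one possible test, not something that was run for this article, and the word counts below are invented.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test on the difference in mean word count."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign the outcome labels and recompute the difference.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)

# Hypothetical word counts per campaign outcome (not the article's data).
failed = [22, 24, 19, 25, 21, 23, 20, 26]
successful = [20, 21, 18, 22, 19, 23, 17, 21]

p_value = permutation_test(failed, successful)
print(f"mean failed={mean(failed):.1f}, "
      f"mean successful={mean(successful):.1f}, p={p_value:.3f}")
```

A small p-value would suggest the gap in length is unlikely to be due to chance alone; it would still say nothing about whether longer descriptions cause failure.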

What are the most common words used?

As we can see, words like “new” and “help” are the most common, followed by words like “album” and “book”.
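A word-frequency ranking like this can be sketched with a `Counter`. The stop-word list and the regular-expression tokenizer below are assumptions for illustration, as are the sample descriptions.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list; real projects typically use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "for", "our", "us", "my", "is", "in"}

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]

# Hypothetical descriptions (not the article's data).
descriptions = [
    "Help us record a new album",
    "A new book of short stories",
    "Help fund my new enamel pin",
]

word_frequency = Counter(w for d in descriptions for w in tokenize(d))
print(word_frequency.most_common(3))
```

Removing stop words first matters here: otherwise words like "a" and "the" would dominate the ranking.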

What are the most common word-pairs used?

This visualisation looks at pairs of words, rather than the single words seen earlier. Again, “need help” seems like a common phrase to use. We can also deduce that many campaigns are launched to raise funds for short films and enamel pins.
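Word pairs (bigrams) can be extracted by zipping each word list with itself shifted by one position. This is a minimal sketch on invented descriptions; whether stop words were removed before pairing is not stated in the article, so none are removed here.

```python
import re
from collections import Counter

def bigrams(text):
    """Return adjacent word pairs from a lowercased description."""
    words = re.findall(r"[a-z']+", text.lower())
    return list(zip(words, words[1:]))

# Hypothetical descriptions (not the article's data).
descriptions = [
    "We need help finishing our short film",
    "A short film about enamel pins",
    "Enamel pins for cat lovers, need help with shipping",
]

bigram_frequency = Counter(pair for d in descriptions for pair in bigrams(d))
for pair, count in bigram_frequency.most_common(3):
    print(" ".join(pair), count)
```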

Which word pairs lead to the most success?

This visualisation is interesting, since it reveals that campaigns related to enamel pins actually perform exceedingly well. Almost 90% of enamel pin campaigns are successfully funded. Campaigns related to “new albums” also do very well. Conversely, descriptions suggesting that you “need help” are actually associated with lower chances of succeeding!
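The success rate per word pair can be sketched as follows. The function name `success_rate_by_bigram`, the `min_count` threshold and the labelled campaigns are all assumptions for illustration, not the article's data or code.

```python
import re
from collections import defaultdict

def bigrams(text):
    """Return adjacent word pairs from a lowercased description."""
    words = re.findall(r"[a-z']+", text.lower())
    return list(zip(words, words[1:]))

def success_rate_by_bigram(campaigns, min_count=2):
    """Map each word pair to (success rate, number of campaigns containing it)."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for description, succeeded in campaigns:
        # Use a set so a pair repeated within one description is counted once.
        for pair in set(bigrams(description)):
            totals[pair] += 1
            successes[pair] += int(succeeded)
    return {
        pair: (successes[pair] / n, n)
        for pair, n in totals.items()
        if n >= min_count  # skip rare pairs, whose rates are too noisy
    }

# Hypothetical (description, was_funded) records.
campaigns = [
    ("Cute enamel pins of garden birds", True),
    ("Enamel pins inspired by classic arcade games", True),
    ("We need help to open our cafe", False),
    ("Need help funding a new album", True),
    ("I need help buying a camera", False),
]

rates = success_rate_by_bigram(campaigns)
for pair, (rate, n) in sorted(rates.items(), key=lambda kv: -kv[1][0]):
    print(f"{' '.join(pair):<12} {rate:.0%} of {n} campaigns")
```

The minimum-count filter is a design choice: a pair that appears in a single successful campaign would otherwise show a misleading 100% success rate.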

Most Important Features in the Model

These words are the most important words used by the Machine Learning model to discriminate between successful and failed campaigns.

Words That Have the Highest Proportion of Success

Campaigns that used these words had the highest proportion of success. These values were derived from the largest positive coefficients of the Logistic Regression model. We can see that overall, enamel pins do very well in campaigns, as was shown previously too.
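To illustrate how word coefficients come out of a Logistic Regression, here is a small pure-Python sketch trained with batch gradient descent on binary bag-of-words features. It is not the model built for this work: the training loop, the hyperparameters and the toy descriptions are all assumptions, and in practice a library implementation with regularisation would normally be used.

```python
import math

def train_logistic_regression(rows, labels, vocabulary, epochs=500, learning_rate=0.5):
    """Fit bag-of-words logistic regression with plain batch gradient descent."""
    weights = {word: 0.0 for word in vocabulary}
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = {word: 0.0 for word in vocabulary}
        grad_b = 0.0
        for words, label in zip(rows, labels):
            score = bias + sum(weights[w] for w in words)
            predicted = 1.0 / (1.0 + math.exp(-score))
            error = predicted - label  # gradient of the log loss w.r.t. the score
            grad_b += error
            for w in words:
                grad_w[w] += error
        bias -= learning_rate * grad_b / n
        for word in vocabulary:
            weights[word] -= learning_rate * grad_w[word] / n
    return weights, bias

# Hypothetical pre-cleaned descriptions (stop words already removed), 1 = funded.
campaigns = [
    ("enamel pin set forest animals", 1),
    ("new enamel pin designs", 1),
    ("new album piano music", 1),
    ("need help opening candle shop", 0),
    ("help launch food truck", 0),
    ("candle similar classic ones", 0),
]
rows = [set(text.split()) for text, _ in campaigns]
labels = [label for _, label in campaigns]
vocabulary = sorted(set().union(*rows))

weights, bias = train_logistic_regression(rows, labels, vocabulary)
ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
print("most positive:", [word for word, _ in ranked[:3]])
print("most negative:", [word for word, _ in ranked[-3:]])
```

Words with large positive weights push the predicted probability of success up, and words with large negative weights push it down. With so little data the weights only reflect which class each word happened to appear in.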

However, one must note that correlation does not imply causation here. Descriptions that include certain words like “enamel pin” also use other words to describe the overall product. Hence, simply throwing the word “pin” into a gibberish description won’t work.

Campaigns for enamel pins also tend to have lower funding goals, which might be why such campaigns are more successful. Regardless, one needs to take the whole description into account before a prediction on the outcome can be made.

Words That Have the Lowest Proportion of Success

These words have negative coefficients in the model, suggesting that descriptions using them tend to have lower chances of success. Interestingly, the word “similar” is associated with poor campaign results. Campaigns about food, YouTube and candles also perform poorly.

As above, the mere presence of such a word does not imply that it causes the campaign to fail.

To Conclude…

How you describe your campaign obviously plays a big role in whether it is successfully funded. Analysing such text data can be especially useful for entrepreneurs, giving insights that Kickstarter itself doesn’t provide.


Isaac Chan

An NLP Data Scientist always seeking to improve his skills :)