Authors: Andrea Yoss and Caroline Harrison
Since many of the best models use millions of training instances and take weeks to run on robust computational resources, it is difficult for the everyday deep learning enthusiast to train comparable models from scratch. Fortunately, we can incorporate parts of those models into a completely different, domain-specific model.
By using a pre-trained model, one can effectively transfer the learning from one model to another. This technique, known as transfer learning, is often used for domain adaptation and for strengthening the accuracy of a model that is going to be trained on…
GetOldTweets3 is a free Python 3 library that allows you to scrape data from Twitter without requiring any API keys. It also allows you to scrape historical tweets more than one week old, which you cannot do with the Twitter API.
Using GetOldTweets3, you can scrape tweets using a variety of search parameters, such as start and end dates, username(s), text query search, and reference location area. Additionally, you can specify which tweet attributes you would like to include, such as username, tweet text, date, retweets, and hashtags.
Let’s run through some examples to illustrate the different ways we can use…
“Finding patterns is easy in any kind of data-rich environment… the key is in determining whether the patterns represent noise or signal.”
— Nate Silver
A typical issue students run into when fitting a model is balancing the model’s bias with its variance, known as the bias-variance tradeoff.
Bias is essentially a measure of “badness”: the higher the bias, the worse your model does when using the very data it was trained on. Typically, a model has high bias and is considered “underfit” when the model performs poorly on the training data because it does not have enough…
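To make the idea of an underfit model concrete, here is a minimal sketch in plain Python. The data, the function names, and the choice of a constant model versus a straight line are all assumptions made for illustration; they do not come from the original post. The sketch fits both models to the same roughly linear training data and compares their error on that training data.

```python
# Illustrative sketch only: the data and helper names below are hypothetical.

def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

def fit_constant(xs, ys):
    """Underfit (high-bias) model: always predicts the mean training target."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

# Made-up training data: roughly y = 2x + 1 with small fixed perturbations.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3]
ys = [2 * x + 1 + e for x, e in zip(xs, noise)]

constant_model = fit_constant(xs, ys)
line_model = fit_line(xs, ys)

# Both errors are measured on the same data the models were fit to.
train_mse_constant = mse(ys, [constant_model(x) for x in xs])
train_mse_line = mse(ys, [line_model(x) for x in xs])

print(f"constant model training MSE: {train_mse_constant:.3f}")
print(f"linear model training MSE:   {train_mse_line:.3f}")
```

The constant model cannot represent the trend in the data, so its training error stays large no matter how it is fit. The straight line has enough flexibility for this particular dataset, and its training error is much smaller.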
According to the World Health Organization (WHO), COVID-19, or “coronavirus disease 2019,” is a respiratory illness caused by a newly discovered coronavirus that is believed to have originated in Wuhan City, China, in December 2019. The virus has since spread to most of the globe, with confirmed cases increasing every day.
There are a variety of approaches you can use when working with time series data, such as linear models, ARIMA models, exponential smoothing methods, and recurrent neural networks (RNNs). In this post, I will focus on an extension of the ARIMA model that also accounts for seasonality in the data: the SARIMA model.
A SARIMA model consists of four pieces,