89% of the data was screened out of the analysis!
The project for this course was to design a research project and perform a structured, though not systematic, literature review for the paper. I called a friend who owns a local start-up in town and asked whether there was any data I could analyse for him. He wanted better insight into his app store reviews to help him and his team shape their product roadmap.
His initial guidance to me was quite general: the study should be “exploratory in order to learn what kind of features good reviews are bringing up, and to find correlations between certain product features and satisfaction.”
My research questions first addressed best practices for analysing app store reviews, then narrowed to the specific questions I wanted the resulting statistics to answer. I received a year’s worth of Apple and Google app store review data to analyse.
My findings were proprietary, so I can only generalise here and cannot share the full paper:
- Most of the analysis time went into the “data preprocessing” activity: cleansing the data against the given criteria, weeding out suspected “spam,” identifying when a reviewer appeared to be discussing a certain feature, and assigning that portion of the review to the feature. I was a single researcher and had no software to help me with any of this.
- 15% of the Apple data and 89% of the Google (Android device) data were excluded during the preprocessing phase.
- The client’s app store data was consistent with what is reported in the literature: ratings were mostly four or five stars, longer reviews tended to be negative, and Apple reviews were of higher quality (i.e. far less suspected spam or other reasons to screen them out) than the Google reviews.
- Apple customers mentioned specific features in their reviews more often, but Google customers offered suggestions more often.
- I created 42 possible features to code against. About 85% of Apple users’ feature mentions concentrated on just six features, whereas Google Android users spread the same proportion across their top 14 features.
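The manual screening and feature coding described above can be sketched in code. Everything here is hypothetical, since the real criteria, spam indicators, and feature codebook were proprietary: the word-count threshold, spam patterns, and keyword-to-feature map are stand-ins for illustration only.

```python
import re

# Hypothetical screening criteria -- the study's actual criteria were proprietary.
SPAM_PATTERNS = [re.compile(p, re.IGNORECASE)
                 for p in (r"free\s+gift", r"visit\s+my")]
MIN_WORDS = 3  # assumed minimum length for a substantive review

# Hypothetical feature codebook: feature name -> trigger keywords.
FEATURE_KEYWORDS = {
    "sync": ("sync", "synchronise"),
    "notifications": ("notification", "alert"),
}

def is_spam(text: str) -> bool:
    """Flag a review as suspected spam if any pattern matches."""
    return any(p.search(text) for p in SPAM_PATTERNS)

def screen(reviews):
    """Drop reviews that are too short or look like spam."""
    return [r for r in reviews
            if len(r.split()) >= MIN_WORDS and not is_spam(r)]

def code_features(review: str):
    """Assign every feature whose keywords appear in the review text."""
    text = review.lower()
    return [feature for feature, keywords in FEATURE_KEYWORDS.items()
            if any(k in text for k in keywords)]

reviews = [
    "Love it",                                    # too short: screened out
    "Visit my site for a free gift!!!",           # suspected spam: screened out
    "Sync works well but notifications are late", # kept; codes two features
]
kept = screen(reviews)
```

With these toy rules, only the third review survives screening and is coded against both the "sync" and "notifications" features; in the real study the same judgements were made by hand, which is why preprocessing dominated the timeline.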
I delivered basic descriptive statistics (i.e. means and standard deviations) for each research question to the client team in a presentation, and they found it very helpful. Future research on this topic should explore automation and additional researchers to code the feature mentions neutrally.
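The deliverable itself was simple to compute once the coding was done. As a minimal sketch, assuming star ratings have been grouped by coded feature (the numbers below are invented, not the client’s data), the per-feature means and standard deviations can be produced with the standard library alone:

```python
from statistics import mean, stdev

# Invented example data: star ratings grouped by coded feature.
ratings_by_feature = {
    "sync": [5, 4, 2, 5, 4],
    "notifications": [1, 2, 3, 2],
}

def describe(ratings_by_feature):
    """Per-feature count, mean, and sample standard deviation of ratings."""
    return {
        feature: {
            "n": len(ratings),
            "mean": round(mean(ratings), 2),
            "sd": round(stdev(ratings), 2),  # sample SD (n - 1 denominator)
        }
        for feature, ratings in ratings_by_feature.items()
    }

summary = describe(ratings_by_feature)
```

A table of exactly these three numbers per research question is what the client team received.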