amathew
Let's talk statistics, 'big data', and data mining in here.
What kind of work do you do? What problems do you work on? What tools do you use? Random thoughts? Book suggestions or blog posts? etc
Whatever, as long as it's related to statistics or statistical computing.
I'm currently working on forecasting leads and sales for an automobile manufacturer, and also trying to apply association rule algorithms to clickstream data to identify common patterns in consumer browsing behavior on a website. Besides that, I do a lot of natural language processing of survey verbatims, both to classify them and to extract the common themes within each class.
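For anyone who hasn't run into association rules before, here is a rough sketch of the idea in plain Python. Everything in it is made up for illustration: the sessions, the page names, and the `pair_rules` helper with its thresholds. It only looks at pairs of pages and is not a full Apriori implementation, so treat it as a toy rather than how a real clickstream job would be set up.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sessions: each set holds the page types one visitor viewed.
sessions = [
    {"home", "models", "build_and_price"},
    {"home", "models", "dealer_locator"},
    {"home", "models", "build_and_price", "dealer_locator"},
    {"home", "offers"},
    {"models", "build_and_price"},
]

def pair_rules(sessions, min_support=0.4, min_confidence=0.7):
    """Return (lhs, rhs, support, confidence, lift) for single-page rules."""
    n = len(sessions)
    item_counts = Counter()
    pair_counts = Counter()
    for session in sessions:
        items = sorted(session)  # sort so each pair has one canonical order
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of sessions containing both pages
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            # confidence: of the sessions with lhs, how many also have rhs
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                # lift: confidence relative to the baseline rate of rhs
                lift = confidence / (item_counts[rhs] / n)
                rules.append((lhs, rhs, support, confidence, lift))
    # strongest rules first, with a stable tie-break on page names
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0], r[1]))

rules = pair_rules(sessions)
for lhs, rhs, support, confidence, lift in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f} "
          f"confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 suggests the two pages show up together more often than their individual popularity would predict; a lift near or below 1 usually means the rule is just picking up a page almost everyone visits.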
By and large, I use R, MySQL, and Python for all my analysis. Occasionally, I'll use Tableau for creating visualizations. In my old job, I had some Hadoop and NoSQL exposure, but I'm now working with much smaller data sets (3 to 5 GB data files). I'd much rather work with 'small data' than 'big data.'
Blog posts I'm enjoying:
http://prdeepakbabu.wordpress.com/2010/02/24/association-rule-mining/
http://blog.revolutionanalytics.com/2014/03/r-and-hidden-markov-models.html