Saturday, 24 October 2015

Announcing an unexpected break

I know that it's been only two days since I started the project, but in my defense I have done 8% of the work. Now, to the point. There is one last company interview that I have to attend, and it's gonna be the deciding factor. Okay, it's not fancy and my chance of winning is statistically very slim. Still, I gotta give some credit to my talent and I gotta try. So, until October 29th, there's gonna be no more posts, sorry to say.

Hope I'll come back all joyful to resume this project. I am looking forward to completing it by January. With that said, I am suspecting rain tonight, if you know what I mean. Until next time, take care.

Peace
Horopter

Cakewalk... I guess!

See, clearly I don't know where I am going with this. The data sample looks pretty interesting, considering that I now have literally 888 samples, i.e. 888 different movies to analyze, all from 2009 to 2014 by my calculation. These are the Bollywood/Hollywood-dubbed/Tollywood-copied/Kollywood-inspired/Sandalwood, well, created movies.

All these movies have anywhere between 1 and 15 weeks of playtime, which is, in my theory, good, since I have a larger dataset to deal with, and bad, since I have a lot of work to do. Both my project partners have no idea what's going on. I have roped in one of them to code in lieu of me. That guy's placement was delayed. Now that he's placed, I can count on him. The second person is, well, a good person and lazy at that. I like lazy people because, duh, I am one of them.

Anyway, I have a lot to consume and analyze. The sheer effort is worth giving me an Oscar.
My bad, not an Oscar, an IEEE publication, which is kinda like an academy award in my field, an academic award. In any case, the work doesn't matter, the result does. I have to draw a parallel between economy and psychology, and that one is tough bread to break.

Today, I separated the list I derived into files based on the movie names. It was a cakewalk... I guess. I now have a list of movie names, which is good, as I need to search trends with those names. The next very important step is, I guess, collecting information, i.e. trends by week. Rgoogletrends should do the trick, but I don't know what will happen when I do the same on Twitter. On Twitter I have two options: one is the searchTwitter option, which I think is great for the sentiment analysis part, and the other is the actual trends part, to check the authenticity of my search, which is not saying much.
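
For the record, the splitting step could look roughly like the sketch below. It is written in Python rather than R, and the column names (`movie`, `week`, `gross`) and the file naming rule are assumptions made for illustration, not the actual layout of my list.

```python
import csv
import re
from collections import defaultdict
from pathlib import Path


def slugify(title):
    """Turn a movie title into a safe file name stem (hypothetical naming rule)."""
    slug = re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")
    return slug or "untitled"


def split_by_movie(rows, out_dir):
    """Write one CSV per movie and return a {movie: path} mapping.

    `rows` is a list of dicts that all share the same keys, including an
    assumed 'movie' key.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Group the rows by movie name first.
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["movie"]].append(row)

    written = {}
    for movie, movie_rows in grouped.items():
        path = out_dir / f"{slugify(movie)}.csv"
        with path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(movie_rows[0].keys()))
            writer.writeheader()
            writer.writerows(movie_rows)
        written[movie] = path
    return written
```

One caveat with this naming rule: two titles that differ only in punctuation would map to the same file, so a real run would need a check for that.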

Once I have the Google Trends data, my project will split into two branches: the people-search part, which is important as it tells me the hype created at that point, and the Twitter part, where I can learn people's reaction to the given movie. It will certainly take a month or two, and I am counting on it. Will it be a cakewalk?

Hare Ram and there goes the plan,
until next time
Horopter.

Thursday, 22 October 2015

Data collection update

Horopter log : 23 Oct 2015 10:50 AM IST

Summary: Web scraping is tough for new people.

It took 7 hours to scrape the data from boxoffice .com, and the backend process of generating the pattern took another 8 hours, since I am quite new to this job. Now I have a set of tables containing the information I need, and I have yet to figure out a way to merge the tables and group them by movie. Once that happens, I can launch the trends. I was mistaken about the twitteR package when I first went through it. It sure does have a trends option, which makes my analysis a bit easier. Furthermore, the Google Trends data is good looking. I am interested in getting data from multiple sources, mostly Facebook, since it is more intimate than Twitter.
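
One possible way to do the merge-and-group step is sketched below in Python. This is not something I have settled on: the column names (`movie`, `week`) are assumptions about the scraped tables, and each table is taken to be a plain list of rows.

```python
from collections import defaultdict


def merge_and_group(tables, key="movie", order_by="week"):
    """Concatenate row tables and group the rows by `key`.

    Each table is a list of dicts. The column names are assumptions,
    not the real scraped schema. Rows in each group are sorted by the
    integer value of `order_by`.
    """
    grouped = defaultdict(list)
    for table in tables:
        for row in table:
            # Strip stray whitespace so "Movie A " and "Movie A" end up together.
            grouped[row[key].strip()].append(row)
    return {
        movie: sorted(rows, key=lambda r: int(r[order_by]))
        for movie, rows in grouped.items()
    }
```

The length of each group then gives the number of weeks a movie stayed in the tables, which is the number the weekly trend queries would need.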

Peace
Horopter out.

Wednesday, 21 October 2015

A new beginning

I am not gonna give a long lecture now like I always do. I have got tools and I have a non-functioning team. I mean at least now I have a team. I have projects hosted online and work that can be shown as my own.

I have realized two things. First, everyone copies code and conceals it. Take my dear friend "The topper" for example. The question that popped into my mind was: where did he get the training data? A little more paranoia and research told me that he paid to get stuff. Now, I don't blame that guy for being so "to himself". He did not do original research, so what's the big deal? That told me that people conceal when they have nothing to offer. Come on, you can't hide sugar, ants always find it, trust me, they do.

Second is that I talk too much, which is why I learnt to be professional. The last post was a month ago and not much progress has been made on the project site. On the brighter side, I got tools and stuff (aforementioned). I have decided to give it a full go.

Lastly, I have come to terms with my body. My conscience is far more beautiful than my mind or body. I am grateful for that. Girls are a bye-bye now for a while. I have got a new perspective on the CS department, which is good, I guess. Developing my dream projects, however ridiculous they are, is gonna happen soon.

I know it sounds cliche, but the recent turn of events is going to define a new phase. With that said, I am gonna post weekly or biweekly updates on the project. I am happy to have this life.

See you again,
Horopter