Twitter: newmarrk



Worlds Apart


-“You want to be a PhD student because you are afraid to leave the safe haven of university.” -“You go into corporate life because you’re not smart enough to be in academia.” These are two sentences I have heard that illustrate how some people emphasize the differences between the academic and corporate worlds. However, these two worlds can benefit from each other greatly.

Too fundamental

In academia, PhD students, for example, devote four years of their professional life to the pursuit of knowledge. Regardless of how proficient the student is, practical applicability is not a requirement of their track. New, fundamental knowledge that can later be used, or ignored, is the main product: innovation in its purest form.

However, this approach runs the risk of losing connection with the ‘real world’. The knowledge may end up in a dissertation that is cited a number of times within a confined area of research and that’s it.

Risky Innovation

In the corporate world innovation is a business risk. Only a fraction of new ideas leads to profitable applications, so it is often smart to be a quick follower rather than a first mover. This risk creates a trade-off between exploration (the aforementioned innovation) and exploitation (everyday business that produces revenue).

Exploration is only possible if the revenue generated through exploitation covers the risk of time spent on exploratory work that never leads to future revenue. As a result, companies lack an incentive to generate new knowledge or ideas, even though they often face questions that go beyond everyday activities.

Combining the two

The two worlds described in the previous paragraphs can benefit from each other tremendously. If the two can meet somewhere in the middle, collaborations may result in applicable innovations that contribute to the academic state of the art.

In my current job I am expected to embody this combination, which is reflected in how my time is divided. I work 40% as a Technical Consultant, which is exploitation as described above. I work 20% as a PhD student. The remaining 40% I spend in between these two worlds.

An example of the work in between these two worlds is the master's thesis I am currently supervising. The owners of the website hardware.info recognized that the audience of their website may have changed over time. However, they do not have the means (time and knowledge) to investigate this themselves. Drawing on the body of scientific literature provided a way to approach the question, which resulted in a study on their website that answered the original question as well as an additional, academically relevant one, broadly described here as “what can we tell from objective behavior of visitors on websites?”.

This first step in combining corporate questions and my own academic research illustrates how a “real-world” question can be answered in a way that is beneficial for both a company and academia.

In general

This type of collaboration is the future. Budget deficits in university departments increase the need for external funding, while the need for innovation in companies keeps growing. Together, these needs create room for the two to negotiate and meet in the middle.

Both worlds can benefit from these collaborations. Companies gain access to innovative power with a limited investment, and by ‘out-sourcing’ the research they provide resources to an academic partner. Apart from money, these resources include the opportunity to move research off the university campus and into society, increasing ecological validity and thus scientific value.

For universities, knowledge valorization is becoming a prominent item on the agenda, with the goal of moving academic research towards the market and increasing visibility and possible revenues. Collaborations between companies and universities contribute directly to this knowledge valorization.

Apart from benefits there are challenges. One example is intellectual property. A university is assessed on its publications, but publishing results may cost a company its leading position or first mover’s advantage. Another example is the difference in standards (e.g. corporate research reports are not assessed with the same level of scrutiny as academic publications), so finding the right balance here is another challenge.

Even though corporations and academia are said to be worlds apart, setting up collaborations between them in the right way will advance both. Acknowledging the benefits and challenges and simply trying collaborations in this form will be a way to move both of these worlds forward.

Tomorrow I leave for Dublin to attend RecSys2012. Together with Dirk Bollen and Martijn Willemsen I performed a study on how memory effects influence the ratings people give. The main finding is that as more time passes between watching and rating, people tend to give less extreme ratings. Presumably this is because people forget the details of how or why they liked or disliked the movie.

Using our psychological background we were able to predict that submitted ratings would change over time. The next step is to research these effects in more depth and use them to better model user preferences. The poster that I will be presenting is below. A link to the proceedings will be added when they are published.

2 Weeks of Turkey were amazing!


One of the presents I got for myself after graduation is a set of turntables and a proper mixer. I’ve been playing around with them a little and finally recorded my first mix. It’s rough, but I’m happy enough with it to post it.

Tracklist is:

  1. Mala – Level 9
  2. Skream – Snarled
  3. Killawatt – 71
  4. Skream – Midnight Request Line
  5. Coki – Badman Place (DMZ Remix)
  6. Skream – X-mas Day Swagger
  7. Hudson Mohawke – Thunder Bay
  8. Dismantle – Computation

The first milestone in my PhD project is nearly reached. To continue my research I need a recommender system that I can easily manipulate.

Since I have already been using the full MyMedia framework, that would perhaps be the easiest route. However, the entire framework delivered by that project is overkill for my needs. The project was forked into MyMediaLite, a lightweight version of the framework that mainly consists of the different recommendation engines.

The drawback of MyMediaLite, however, is that it is not readily usable as a web service. Enter ServiceStack… This library allows for easy deployment of a .NET application as a web service.

The combination resulted in a .NET application that can be deployed on a web server. Right now the only methods provided are submitting ratings and receiving recommendations, both as REST interfaces. Recommendations are returned as a JSON object. To make this work a number of things had to be changed because of the lack of thread safety in MyMediaLite. The main change is that ratings are stored in a queue that is processed periodically. Without this addition the data was not consistent, as there is a mapping between external IDs (used in the HTTP requests) and internal IDs (used in the recommender engine).
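The queueing idea can be sketched roughly as follows. This is an illustration in Python rather than the actual .NET code, and it does not use the MyMediaLite or ServiceStack APIs: the names `QueuedRatingStore`, `submit`, `flush` and `start_periodic_flush` are hypothetical, and a plain dictionary stands in for the recommender engine. The point it tries to show is that request handlers only enqueue ratings, while a single periodic step does the external-to-internal ID mapping and updates the data.

```python
import queue
import threading


class QueuedRatingStore:
    """Hypothetical stand-in: buffers ratings and applies them in batches."""

    def __init__(self):
        self._pending = queue.Queue()  # thread-safe buffer filled by request handlers
        self._user_ids = {}            # external user ID -> internal index
        self._item_ids = {}            # external item ID -> internal index
        self.ratings = {}              # (internal user, internal item) -> rating
        self._flush_lock = threading.Lock()

    def submit(self, external_user, external_item, rating):
        """Called per request; only enqueues and never touches the mappings."""
        self._pending.put((external_user, external_item, rating))

    @staticmethod
    def _internal_id(mapping, external_id):
        # Assign the next free index the first time an external ID is seen.
        return mapping.setdefault(external_id, len(mapping))

    def flush(self):
        """Drain the queue and apply the ratings; returns how many were applied."""
        applied = 0
        with self._flush_lock:  # only one flush may mutate the mappings at a time
            while True:
                try:
                    user, item, rating = self._pending.get_nowait()
                except queue.Empty:
                    break
                u = self._internal_id(self._user_ids, user)
                i = self._internal_id(self._item_ids, item)
                self.ratings[(u, i)] = rating
                applied += 1
        return applied


def start_periodic_flush(store, interval_seconds, stop_event):
    """Call store.flush() every interval_seconds until stop_event is set."""
    def loop():
        while not stop_event.wait(interval_seconds):
            store.flush()
        store.flush()  # apply whatever is still queued on shutdown

    worker = threading.Thread(target=loop, daemon=True)
    worker.start()
    return worker
```

Because all ID assignment happens inside `flush`, two simultaneous requests for a new user cannot end up with two different internal IDs, which is the kind of inconsistency the queue is meant to avoid. The price is that a submitted rating only becomes visible after the next flush.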

Yesterday and today I have been stress testing the application through Blitz.io and the numbers are quite surprising. The report shows that the application can apparently handle 250 hits/s easily.

It does have some startup issues, so going from 0 to 250 hits/s instantaneously causes timeouts. Sadly I cannot try any higher loads with my Blitz account.

Response Time by Concurrent Users

So for now the idea is to create a user interface around this application, so that the research can continue. We will probably still be using the 10M dataset of GroupLens, as we have done most of our research on this dataset. But with full control of the application it is also time to think of new ideas, like visualizing preferences and/or movies, similar to what I did in my thesis. Time will tell which way to go…