Unix Spiders: October 2011

I attended this event, 'Future Signals: where next for mobile data?', on the afternoon of Wednesday 19th October 2011 at City University London. I found it a very stimulating and thought-provoking forum, so here are my thoughts based on the notes I took. Mike Hawkes from the MDA introduced the event and we had four talks: two focused on more technical issues (Lee/Guest and Garner) and two focused more on content (Holden and Reynolds), although by the very nature of the subject there were overlaps between the two groups.

Paul Lee and Matt Guest – Mobile Data Consumption Trends

The talk looked ahead at the trends in mobile up to the year 2030. Two aspects of mobile were highlighted, networks and devices, and these were tackled in turn. The methodology used to gather the data was briefly described, e.g. inputs came from a variety of sources, varied both by type of organization and by geography. The focus of the talk was very much on what people want rather than what the technology can do (perhaps the latter has been more prominent in the past than it should have been).

1. Networks

A number of problems were outlined, and tensions between various aspects were stated. Mobile networks are likely to become much more congested, and there may not be sufficient scope to increase bandwidth due to constraints on the spectrum (i.e. how much is available) and on planning permission (putting up more poles). A wide variety of different types of devices is likely to become available, meaning that the network will increasingly need to deal with device heterogeneity. However, there is also the potential that 4G/5G will allow networks to carry even more data with faster upload and download speeds. Users will be able to use their devices more seamlessly than previously, with blending of wi-fi and cellular networks and web/wi-fi bypass. An example of this would be a device which can move between cellular, wi-fi and short range networks. The type of applications on these three levels will differ, as speed affects choice. Due to the business model of mobile operators, there may well be resistance to seamless use of devices. Most people use their mobile broadband at home, therefore some usage may move back to DSL/Cable. There are positives and negatives in using wi-fi, but the former outweigh the latter. It's a pretty complicated picture to me.

It is clear that mobile data usage is increasing at an exponential rate, leading to increasing competition for bandwidth. This is because of more data usage and many more types of mobile devices being available (see above). This problem is clearly going to magnify with time.

2. Devices

As stated above, the numbers and types of mobile devices will increase substantially, and devices will become more specialized. Ubiquitous computing is becoming quite common. The ideal of a single integrated device will never happen in the speakers' view. Single device ownership dominates, however users will accumulate more devices due to various factors such as Moore's Law, falling average selling prices (ASPs), groovy new technology (who will be the new Steve Jobs?) etc. Differing user needs for mobile also contribute to this diversity. New ways of connecting devices will become available. However, users will still aspire to owning fewer devices. How this tension will work out in practice remains to be seen. Devices will improve in a number of different ways, however battery technology is holding things back (as I found recently with my MacBook - the battery is knackered and it would cost me £100 to replace it). When oh when will we do something about batteries! Speed will therefore increase, but users are focused on cost and power consumption. Not all devices will be connected e.g. your fridge (why!!!!???). It is suggested that with the increase in ubiquitous devices, each person will have around 50 devices. This is contestable.

Martin Garner – Phones and tablets usage

Martin focused on tablet usage in terms of mobile devices (e.g. the iPad). The tablet is regarded as a key device, one that may well be the driver for future mobile device usage (if not computing usage). The evidence for the talk was taken from a sample of early adopters of the technology.

What is a tablet computer? What is it used for? It can be a mobile internet device, a companion to the TV/living room (for entertainment), a PC substitute or a new category of computing (according to 45% of the survey). User behavior is adapting to ownership of these technologies. The PC is the main device for web usage, however the range of devices which are available is beginning to change this behavior e.g. for email, social networking, twittering etc. According to the evidence, the PC is beginning to suffer because of this change. Which rather raises the question: will tablets take over from PCs? Not IMHO, unless they come up with a 27 inch screen tablet (like my iMac) which could fold so I could fit it in my pocket... Goodness me, it's Minority Report all over again! That never happened, did it...?

The tablet market is growing very strongly, and there is a strong burst of sales around Christmas. The iPad2 dominates currently (70% of sales in August 2011), however sales from other manufacturers are starting to pick up. In Europe, the UK is the biggest market. It seemed to me that sales figures for countries that have a very serious debt crisis are substantially diminished, e.g. Ireland and Greece. This could partly be due to the crisis and partly due to the fact that these countries tend not to be early adopters like the UK (established by me after I noticed this in the figures presented and asked Martin a question about it). There is no argument about how big the market is: 47 million devices now, increasing to 80 million shortly and to around 1 billion over the next four years. This means big money! See below in my notes on Steve Reynolds' talk on cross-platform computing for mobiles, and who might win.

Tablets therefore could be the future of mass market computing. Tablet sales are not quite at the level of PC sales, but are increasing rapidly (see above). Microsoft had better be worried about this. In terms of who is buying, the ratio is 65% male to 35% female. Unsurprisingly, as tablets are more expensive than smartphones, ownership maps to wealth. Younger people tend to dominate smartphone ownership (the digital natives). Why do people buy tablets in the first place? Brand (i.e. Apple) dominates, but other factors include a bigger screen with more light. Other manufacturers are beginning to take this seriously and have a chance (see above). Both wi-fi and 3G tablets will be available, the choice of which will depend on infrastructure and running costs. Around 60% of users have 3G, but around half do not use all of the functionality. Tablets are used mostly as leisure devices, but around 40% use them for work. There appears to be a barrier set by mobile operators (obviously defending their territory) which prevents more tablet usage: the current tariff model for mobile devices does not work for tablets.

The outlook is as follows. There will be a huge increase in supply and uptake. Users will continue to be frustrated with 3G tariffs, so wi-fi will continue to drive the tablet market. Apple dominates, and will continue to dominate. We are in the first stages of a whole new market, and there is some evidence that other manufacturers will grab back some of the market share.

Windsor Holden – Mobile Publishing

Windsor gave a talk more focused on content, e.g. using eBooks or eNewspapers through mobile devices. He explained the drivers for the ePublishing market, such as eReader growth (Amazon Kindle uptake) and availability of content (Amazon has around 1 million eBooks available for purchase and download at the time of writing).

He said a few words on the EPUB standard, which is heavily used by Amazon, and which allows access to content in other ways (other than through the Kindle in this case).

The rise of Apple's App Store is having a significant effect on the sector. There is a wide diversity of applications including eBook readers and ePrint for Apple devices (such as the iPad). Users are happy to pay for applications: the outlay is around $14 per head. There is a market for users to consume content through App Store applications. Therefore there is the potential to access content across devices. The following value chain was outlined:

Author -> Publisher -> Retail Platform -> Device -> User

Windsor’s talk focused on the “Retail Platform -> Device” part of this value chain, with the App Store/iPad and Amazon Kindle given as examples. These technologies/platforms are having a significant impact on the publishing industry. Local newspapers are dying on their feet, and bookstores are closing at an alarming rate: from June 2010 to June 2011 there has been a 26% reduction in bookstores.

This in part can be explained by business models used so far. Amazon Kindle made a loss initially, but is now beginning to gain market share and hence a profit. Retailers are therefore looking to move into the eReader space e.g. W.H. Smith, Waterstones as well as Barnes and Noble with their ‘Nook’ reader. Many publishers are moving toward a digital only environment. Producing books/newspapers costs money, and if folk don’t buy them anymore then something else needs to be done e.g. new business models to support such content providers.

A number of business models were outlined:

Pay per download: e.g. Kindle. This is the prevalent method.
Subscription: academic publishers providing online access to whole journals.
Free: this includes open access journals and author pays models.

Publishers are really worried about how they can make money in the new digitized environment. The web as a disruptive technology initiated the process of undermining current business models, and it seems to me that mobile will exacerbate this.

A good example of this is newspapers. Sales have been in decline for a number of years, but this is accelerating with the uptake of smartphones. This is impacting on advertising revenue in a very big way: fewer readers means advertisers are less willing to shell out money to newspapers for ad space. Google started this with AdWords and searches. Why pay out for a newspaper ad (with little data to collect) when you can get information on click-throughs for searches and pay according to how many users have actually clicked on your ad? There is a direct relation between click-throughs and your income, and you can measure it. You can't do this with newspaper/magazine ads. The rise in the cost base is also impacting on resource issues, so it's a double whammy!

Windsor talked about the price point conundrum. Users are used to getting news free of charge. There is clearly a negative impact in losing eyeballs on your site (see the point on advertising above). When The Times put up their paywall in June 2010 they lost 90% of their online readership, and they are not getting payback on this. It is the view of a number of observers that print newspapers may not have much of a future, if any at all. However it may be possible to sustain digital only editions.

How can publishers successfully migrate to the new environment? They will need to offer something extra, over and above what can be made available in print versions, e.g. audio/video (which actually does happen now). It did make me wonder whether the lines between media outlets, e.g. broadcasters such as the BBC and publishers such as News International, will blur somewhat. Publishers will need to leverage the strength of existing brands e.g. The Times. These will need to be available on all kinds of mobile devices e.g. smartphones, tablets etc.

Newspapers will need to compete with something called the NeoNewspaper – first time I’ve come across this term. If eBook sales continue to rise, the range of devices will increase – perhaps this will be an opportunity rather than a threat?

Steve Reynolds – Consumerisation of IT: the mobile perspective

Rob Bamforth was due to give this talk, but could not do so due to personal reasons.

The focus of the talk was on how consumer technology such as tablets and smartphones can be used in the enterprise. If your workforce already have the devices, you don't need to shell out money for them. It's very costly for the employer to buy this technology in, so why do so if your employees already have it? From an employer's point of view, why not use it? It is a bit more problematic from an employee's point of view, particularly if you want some rest after a hard day's work! It seems some employees are happy to do this, and ask their employers if they can use their mobile devices for their jobs.

Private and enterprise use of mobile devices are worlds apart – home use focuses on entertainment (e.g. the football scores), while enterprise use focuses on information use and dissemination.

In particular, the use of cloud computing needs to be factored in to enterprise use. Applications to support data access using a wide variety of devices are necessary e.g. tablets, smartphones, mobiles. CRM will be in the cloud and needs to be accessible. The cloud must assist the uptake of mobile devices for use in a variety of circumstances. An increase in network bandwidth will be essential for encouraging use (see notes on the first talk above). There will be an auction for 4G networks in 2012, which will be an important event in the UK. The cloud can provide support for all areas of business activity e.g. sales.

In terms of working life, it is clear that UK workers are becoming more mobile, which is noteworthy compared with the rest of the world. However this does vary between sectors, e.g. the marketing/media sector uses the technology quite a bit compared with retail/transport (I suspect that transport accounts for most of the mobile figures in that rather artificially created sector). Bottom line: users are increasing the percentage of time they spend using smartphones for work purposes.

A really big issue here is the multiple operating systems challenge. This was regarded as the biggest challenge to the use of mobile devices in the enterprise space. Cross-platform applications are a must if a wide variety of devices is to be supported in the enterprise. The question posed was: will HTML 5 be the enabler of this cross-platform compatibility? Flash is dying on its feet. What technology stack will be used to write applications? No one size fits all. It struck me at this point that the mobile space is very much in the same state as the PC market in the early 1980s, before Microsoft emerged as the winners. Given this I posed the question: who will be the winner this time? Will it be Apple with their lead in mobile devices, Google Android, or Microsoft again? How will Apple cope without Steve Jobs's direction? Steve Reynolds was quite enthusiastic about Microsoft's ability to provide the necessary platform, although he felt that the company has missed Bill Gates's direction and has made some pretty serious errors since he stepped down. Will Windows 8 deliver the required functionality? Personally I'd prefer an open source solution using common standards, but this isn't likely to happen for various reasons (open source is driven by nerds, whose idea of interactivity is the UNIX command line prompt). It's clear that 'Minority Report' style interactions are needed for mobile devices. Only time will tell what the outcome will be!

There is evidence of an increase in people's intention to get tablet computers at some time in the future, particularly (and not surprisingly) the iPad. More interest was expressed in smartphones, however, e.g. 51% would be more likely to buy this type of mobile device. The problem is that 4/7 applications on this platform are considered to be malware. This has to be dealt with if these devices are to be used for the benefit of the enterprise. Steve talked briefly about the connected car, and his vision for the car as a mobile device: getting travel information, coping with congestion, and assisting the driver with better, more economical styles of driving (erm, not for me - I like to be in control of the car!).

Acknowledgements

Many thanks to Martin Ballard of the Mobile Data Association and the ICT KTN for organizing the event.

Enterprise Search

I attended the Enterprise Search meetup on Tuesday 4th October 2011, and here are some brief notes from the event. We had two talks and a 'fishbowl' discussion. I'll concentrate on the talks, and bring in issues from the 'fishbowl' where necessary. The talks were from Iain Fletcher and Charlie Hull.

1. Iain Fletcher - Search Technologies. Data Quality, the Missing Ingredient for Enterprise Search

The essential problem introduced in the talk was that poor data leads to poor search, and to dissatisfaction on the part of users with the service they get from the deployed system. Problems increase with time, and search administrators need to constantly update their systems to cope with growth in data. Failure to tackle these problems leads to increasing user dissatisfaction - evidence shows that half the time or less, users do not get what they want. Relevance ranking has come a long way over the years, however many ranking algorithms need to be tuned to the collection they retrieve on, optimizing constants such as b and k1 in the BM25 matching function. I've done some of this type of work myself, and had some success at the Web Track @ TREC.
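To make the role of the two tuning constants concrete, here is a minimal sketch of one common textbook form of BM25 in Python. It is an illustration only, not the system Iain described or the code I used at TREC: the function name, the toy documents and the non-negative idf variant are all my assumptions. k1 controls how quickly repeated occurrences of a term stop adding to the score, and b controls how strongly long documents are penalised.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenised document in `docs` against `query` (a list of terms)."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many documents contain each term at least once.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if tf[term] == 0:
                continue  # term absent from this document, contributes nothing
            # Non-negative idf variant (an assumption; other variants exist).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalisation: b=0 switches it off, b=1 applies it fully.
            norm = 1 - b + b * len(d) / avg_len
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * norm)
        scores.append(score)
    return scores
```

With b set to 0 two documents with the same term frequency score identically whatever their length, which is why the best value of b depends so much on the collection.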

A possible solution is auto-personalisation, however Iain suggested that this may not work well, and there is evidence for this in the academic literature.

Iain stated that search engines tend to rely on meta-data, and he gave the example of Google, who rely on well written pages from which meta-data can be extracted. Classification using the extracted meta-data can thus narrow down the search, giving the user some ability to improve their search results (as always with search, this must depend on the user's ASK, i.e. their anomalous state of knowledge). When writing web pages to be retrieved it is best to remove non-relevant text from the page to increase the chances of the page being retrieved - this is the process of 'cleaning' the data. An example would be having a bio of an author on every page of their website - this would impact on search negatively. A process of normalization can be used to ensure that relevant text is put together on the same page, increasing the chance of better search results.

Iain then talked about complexity management, which can be a real problem in search. He advocated the use of TQM (Total Quality Management) for search, using a black box method to find problems. Optimizing on one variable is problematic, as one does not know the effect of doing this on other variables - a holistic approach needs to be taken if this is going to work in any sensible way. I myself used a brute force approach to optimizing tuning constants on the BM25 matching function, but you could think of using machine learning to do this - Microsoft have used gradient descent techniques for this kind of work (I can dig up the reference on request).
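For readers who have not seen it, a brute force approach of this kind amounts to trying every combination of constants on a grid and keeping the one that scores best on a set of judged queries. The sketch below is a toy version under my own assumptions, not my TREC code: the compact BM25 scorer, the choice of mean reciprocal rank as the measure, the function names and the single relevant document per query are all illustrative.

```python
import itertools
import math
from collections import Counter

def bm25(query, doc, df, n_docs, avg_len, k1, b):
    """Compact BM25 score of one tokenised document for one query."""
    tf = Counter(doc)
    score = 0.0
    for term in query:
        if tf[term] == 0:
            continue
        idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
        norm = 1 - b + b * len(doc) / avg_len
        score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * norm)
    return score

def mean_reciprocal_rank(docs, judged_queries, k1, b):
    """judged_queries is a list of (query_terms, index_of_relevant_doc) pairs."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    df = Counter(t for d in docs for t in set(d))
    total = 0.0
    for query, relevant_idx in judged_queries:
        scores = [bm25(query, d, df, n_docs, avg_len, k1, b) for d in docs]
        ranking = sorted(range(n_docs), key=lambda i: -scores[i])
        total += 1.0 / (ranking.index(relevant_idx) + 1)
    return total / len(judged_queries)

def grid_search(docs, judged_queries, k1_grid, b_grid):
    """Try every (k1, b) pair and return (best_mrr, best_k1, best_b)."""
    best = None
    for k1, b in itertools.product(k1_grid, b_grid):
        mrr = mean_reciprocal_rank(docs, judged_queries, k1, b)
        if best is None or mrr > best[0]:
            best = (mrr, k1, b)
    return best
```

Both constants are varied together here, which is the point Iain made about single-variable optimization. The cost grows with the product of the grid sizes, which is what makes learning-based methods attractive on larger problems.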

Iain concluded with a number of suggestions as follows:

Data needs to be thought about properly. Focus needs to be on the data, rather than the search engine.
A formal model of data is required, and a data model design is needed. In the discussion later, it appeared that in many circumstances no formal document is available to provide this information or the requirements for search. Transparency is a very important factor.
A process to keep search working is essential, and it must adapt to changes in the data as it grows with time. Otherwise the search will break!

2. Charlie Hull - Flax Search. Just the Job - Employing Solr for Recruitment Search

Charlie gave an interesting talk on the practical application of search technologies to a real world case study, in this case Reed Recruitment. Reed Recruitment has significant data problems, with 3 million job seekers in their database, and around 300 end users dispersed throughout 350 offices in the UK. Before the new system, their search was implemented on a transactional system using Oracle, the relational database system.

To say the Oracle search was clunky would be something of an understatement. The user had 20 to 30 fields to choose from, and had to wait a significant length of time for the results as hundreds of millions of database records were processed. Structured data was held on salaries etc., as well as unstructured information such as CVs and job specifications. Oracle is fine for structured data, but very poor for unstructured data IMHO.

In order to create the new search, data had to be extracted from Oracle and transformed to a format which could be used by a search engine such as Solr. Two processes, based on XML, are defined:

Indexer: extract and process the data from Oracle.
Config: builds and verifies the data for the search engine.
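I don't have the details of the Flax indexer, so the following is only a rough sketch of the general idea: rows pulled from a relational database are turned into the XML 'add' message that Solr accepts for indexing. The function name and the field names (id, cv_text, salary) are hypothetical, and the rows are plain Python dictionaries standing in for the results of an Oracle query.

```python
import xml.etree.ElementTree as ET

def rows_to_solr_add_xml(rows):
    """Turn a list of row dictionaries into a Solr <add> update message."""
    add = ET.Element("add")
    for row in rows:
        doc = ET.SubElement(add, "doc")
        for name, value in row.items():
            if value is None:
                continue  # skip NULL columns rather than index empty fields
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(value)  # ElementTree escapes &, < and > on output
    return ET.tostring(add, encoding="unicode")

# Hypothetical rows, as if fetched from the database.
rows = [{"id": 1, "cv_text": "Java & SQL developer", "salary": None}]
xml_message = rows_to_solr_add_xml(rows)
```

The resulting string would then be posted to the search engine's update handler. A real indexer would also need to handle batching, dates and multi-valued fields.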

Charlie described the process using a diagram, which I don't have but which was illuminating and helped understanding (I won't try and replicate it here, my drawing skills are rubbish!). Reed did the interface part of the project, as they know their users well.

Overall I found this a very useful case study of applying open source software to real world problems. Later on in the discussion, there was an interesting interaction on using open source vs. proprietary software. According to Iain Fletcher the choice is largely down to policy, which invariably means Microsoft. I was reminded of the old adage "nobody ever got sacked for buying IBM"; these days it's "nobody ever got sacked for buying Microsoft"!

The search is now live and working well - Reed are satisfied with it. One interesting fact that emerged was that there is considerable resistance from users who have got used to using the old system. This is normal, and reminds me that the only reason Dialog is around is because the information scientists using it demand access to a command line interface (power users who want to retain control of their world, and prevent disintermediation). These problems do not appear to occur with new members of staff, not yet initiated into the ways of the old system.
