Web Content Analysis of Online Grocery Shopping Web Sites in India

2018 ◽  
Vol 5 (4) ◽  
pp. 61-73
Author(s):  
Tanushri Banerjee ◽  
Arindam Banerjee

This article evaluates online grocery shopping web sites catering to customers primarily in India. The evaluation was carried out in three parts using RapidMiner. In part A, the authors studied the similarity of the content that resides on the grocery shopping web sites. Using unstructured data from the homepages of the web sites and the keywords specified for them, the authors set out to establish a cosine similarity index among the sites. In part B, the authors analysed the customer reviews from the web sites. By studying the resulting association rules, they attempted to identify the attributes that drive customer happiness. In part C, the authors documented the web traffic metric parameters (attributes) measured by search engine optimization (SEO) tool web sites. They then created a correlation matrix to determine which parameters significantly impact per-day revenue for the web sites.
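As a rough illustration of the cosine similarity index mentioned in part A, the sketch below compares two texts using raw term counts. It is not the authors' RapidMiner process: the whitespace tokenization, the function name `cosine_similarity`, and the sample strings are assumptions made for the example.

```python
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity of two texts, using raw term counts as vectors."""
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())

    # Dot product over the terms the two texts share.
    dot = sum(vec_a[term] * vec_b[term] for term in vec_a.keys() & vec_b.keys())

    # Euclidean norms of the two count vectors.
    norm_a = math.sqrt(sum(count * count for count in vec_a.values()))
    norm_b = math.sqrt(sum(count * count for count in vec_b.values()))

    # An empty text has no direction, so treat its similarity as zero.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical homepage snippets, not data from the article.
    print(cosine_similarity("fresh fruit vegetables", "fresh milk vegetables"))
```

A value near 1 means the two texts use nearly the same vocabulary in similar proportions; a value of 0 means they share no terms.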


Author(s):  
Shaoyi He

The World Wide Web (the Web), a distributed hypermedia information system that provides global access to the Internet, has been most widely used for exchanging information, providing services, and doing business across national boundaries. It is difficult to find out exactly when the first multilingual Web site was up and running on the Internet, but as early as January 1, 1993, EuroNews, the first multilingual Web site in Europe, was launched to simultaneously cover world news from a European perspective in seven languages: English, French, German, Italian, Portuguese, Russian, and Spanish (EuroNews, 2005). In North America, Web site multilinguality has become an important aspect of electronic commerce (e-commerce) as more and more Fortune 500 companies rely on the Internet and the Web to reach out to millions of customers and clients. Having a successful multilingual Web site goes beyond just translating the original Web content into different languages for different locales. Besides the language issue, there are other important issues involved in Web site multilinguality: culture, technology, content, design, accessibility, usability, and management (Bingi, Mir, & Khamalah, 2000; Dempsey, 1999; Hillier, 2003; Lindenberg, 2003; MacLeod, 2000). This article will briefly address the issues related to: (1) language, which is one of the many elements that make up culture, (2) culture, which greatly affects the functionality and communication of multilingual Web sites, and (3) technology, which enables the multilingual support of e-commerce Web sites, focusing on the challenges and strategies of Web site multilinguality in global e-commerce.


2011 ◽  
Vol 52-54 ◽  
pp. 25-30
Author(s):  
Bo Wang ◽  
Yong Wei Wu ◽  
Wei Min Zheng

Web servers aim to serve clients responsively, and clients want to browse web sites at high bandwidth. Nevertheless, users sometimes receive web content slowly because of slow communication. By buffering data, a web cache gives clients an alternative way to obtain web content from the web server at low communication cost and high bandwidth, and therefore with a fast response. When large quantities of data are buffered on the cache server, the cache policy becomes an important factor that clearly affects performance. Our contribution is a redesigned cache policy that takes full account of the user's visiting action. We compare our policy (UVA) with an existing method (LFU) through a series of experiments. The results show that our method can efficiently improve the cache hit rate.
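For readers unfamiliar with the LFU baseline named in the abstract, the following is a minimal sketch of a least-frequently-used cache that also tracks its hit rate. It does not implement the authors' UVA policy, whose details are not given here. The class name `LFUCache`, the `fetch` callback, and the tie-breaking rule (evict the oldest entry among equally infrequent ones) are assumptions made for the example.

```python
from typing import Callable, Dict


class LFUCache:
    """Tiny least-frequently-used cache that records its own hit rate."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.store: Dict[str, str] = {}  # url -> cached content
        self.freq: Dict[str, int] = {}   # url -> accesses while cached
        self.hits = 0
        self.requests = 0

    def get(self, url: str, fetch: Callable[[str], str]) -> str:
        """Return content for url, calling fetch(url) only on a cache miss."""
        self.requests += 1
        if url in self.store:
            self.hits += 1
            self.freq[url] += 1
            return self.store[url]

        content = fetch(url)
        if len(self.store) >= self.capacity:
            # Evict the least frequently used entry; min() returns the first
            # minimum, so ties fall to the entry inserted earliest.
            victim = min(self.freq, key=self.freq.get)
            del self.store[victim]
            del self.freq[victim]
        self.store[url] = content
        self.freq[url] = 1
        return content

    def hit_rate(self) -> float:
        """Fraction of requests served from the cache."""
        return self.hits / self.requests if self.requests else 0.0


if __name__ == "__main__":
    cache = LFUCache(capacity=2)
    for page in ["a", "a", "b", "c", "a"]:
        cache.get(page, fetch=lambda u: u.upper())  # stand-in for a real fetch
    print(cache.hit_rate())
```

Comparing two policies, as the abstract describes, would amount to replaying the same request trace through each cache and comparing the resulting hit rates.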


2011 ◽  
pp. 1187-1194
Author(s):  
Shaoyi He

The World Wide Web (the Web), a distributed hypermedia information system that provides global access to the Internet, has been most widely used for exchanging information, providing services, and doing business across national boundaries. It is difficult to find out exactly when the first multilingual Web site was up and running on the Internet, but as early as January 1, 1993, EuroNews, the first multilingual Web site in Europe, was launched to simultaneously cover world news from a European perspective in seven languages: English, French, German, Italian, Portuguese, Russian, and Spanish (EuroNews, 2005). In North America, Web site multilinguality has become an important aspect of electronic commerce (e-commerce) as more and more Fortune 500 companies rely on the Internet and the Web to reach out to millions of customers and clients. Having a successful multilingual Web site goes beyond just translating the original Web content into different languages for different locales. Besides the language issue, there are other important issues involved in Web site multilinguality: culture, technology, content, design, accessibility, usability, and management (Bingi, Mir, & Khamalah, 2000; Dempsey, 1999; Hillier, 2003; Lindenberg, 2003; MacLeod, 2000). This article will briefly address the issues related to: (1) language, which is one of the many elements that make up culture, (2) culture, which greatly affects the functionality and communication of multilingual Web sites, and (3) technology, which enables the multilingual support of e-commerce Web sites, focusing on the challenges and strategies of Web site multilinguality in global e-commerce.


Author(s):  
Kai-Hsiang Yang

This chapter will address the issues of Uniform Resource Locator (URL) correction techniques in proxy servers. Proxy servers are increasingly important in the World Wide Web (WWW): they provide Web page caches for browsing Web pages quickly and also reduce unnecessary network traffic. Traditional proxy servers use the URL to identify cached pages, and a cache miss occurs when the requested URL does not exist in the cache. However, for general users, there must be some regularity and scope in browsing the Web. It would be very convenient for users if they did not need to enter the whole long URL, or if they could still see the Web content even though they forgot some part of the URL, especially for their personal favorite Web sites. We will introduce a URL correction mechanism into the personal proxy server to achieve this goal.
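One simple way to picture URL correction in a personal proxy is approximate string matching against the URLs already in the cache. The sketch below uses Python's standard `difflib.get_close_matches` for this; it is an illustrative stand-in, not the mechanism proposed in the chapter. The function name `correct_url`, the similarity cutoff of 0.8, and the example URLs are assumptions.

```python
import difflib
from typing import List, Optional


def correct_url(requested: str, cached_urls: List[str],
                cutoff: float = 0.8) -> Optional[str]:
    """Map a possibly mistyped URL to the closest cached URL, if any."""
    # An exact match is an ordinary cache hit; no correction is needed.
    if requested in cached_urls:
        return requested

    # Otherwise look for the single most similar cached URL whose
    # similarity ratio is at least `cutoff` (an assumed threshold).
    matches = difflib.get_close_matches(requested, cached_urls, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    # Hypothetical cache contents.
    cache = ["http://example.org/news/index.html", "http://example.org/sports/"]
    print(correct_url("http://example.org/news/indx.html", cache))
```

A request that resembles nothing in the cache returns `None`, in which case the proxy would fall back to treating it as a normal cache miss.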


Author(s):  
Shuk Ying Ho

Hundreds of thousands of companies worldwide are using the Web as a major channel to interact with their customers for brand promotion, product marketing, order fulfillment, and after-sales support. Competition is extremely keen among online merchants. In doing business online, the question that lurks in the back of their minds is, are we maximizing our business opportunities? With the high interactivity of e-commerce, online merchants now adopt various differentiating strategies to attract and retain customers in the hope of remaining competitive. To provide a differentiated service, online merchants first identify each individual, and then acquire more information about each individual’s interests. Then, they can tailor Web content directly to a specific user by having the user provide information to the Web site either directly or through tracking devices on the site. The software can then modify the content to the needs of the user. Ultimately, highly focused and relevant products or services are delivered to each customer, who is treated in a unique way to fit marketing and advertising with his or her needs. This process is generally named personalization. There is a wide range of personalization strategies used nowadays. For instance, My Yahoo! provides a personalized “space” for each user. It automatically generates personalized content (e.g., information on the horoscope for the correct star sign) matched with users’ profiles (e.g., a person’s date of birth). Apart from automatic personalization, it also presents the users with an array of choices and allows the users to select what is of interest to them. The users can personalize not only the content (e.g., weather, finance) but also the layout (e.g., color, background). My Yahoo! was considered to be one of the forerunners among the growing number of personalized Web sites that have been springing up on the Internet over the last few years (Manber, Patel, & Robison, 2000).
Amazon.com greets returning customers with a personalized message and offers a hyperlink to book recommendations congruent with their past purchases. These recommendations are generated based on the customers’ previous purchases and the preferences of like-minded people, and there is no extra work imposed on the customers. Amazon continues to develop its personalization system, and more filtering mechanisms are being added to make the book recommendations more relevant and useful. Recently, there has been the introduction of a personalized search engine, A9.com by Amazon.com, which recommends relevant Web sites to each individual by analyzing his or her browsing history and bookmarks. Expedia.com asks users for their desired destinations and then e-mails them information about special discounts to the places where they would like to travel. It is expected that corporate investment in personalization technologies will continue to surge in the future (Awad & Krishnan, 2006; Poulin, Montreuil, & Martel, 2006; Rust & Lemon, 2001). Given the proliferation of personalization, this chapter will address the key issues related to personalization and provide definitions of some key terms, such as rule-based personalization and collaborative filtering.
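To make the idea of collaborative filtering concrete, here is a minimal user-based sketch: it recommends items bought by customers whose purchase histories overlap with the target customer's. It is a toy illustration, not Amazon's system or the chapter's definition. The function name `recommend`, the use of Jaccard similarity, and the sample purchase data are all assumptions.

```python
from typing import Dict, List, Set


def recommend(target: str, purchases: Dict[str, Set[str]],
              top_n: int = 3) -> List[str]:
    """Suggest items for `target` based on customers with similar purchases."""
    own = purchases[target]
    scores: Dict[str, float] = {}

    for user, items in purchases.items():
        if user == target:
            continue
        union = own | items
        if not union:
            continue
        # Jaccard similarity: shared purchases / all purchases of the pair.
        similarity = len(own & items) / len(union)
        if similarity == 0:
            continue  # ignore customers with nothing in common
        # Every item the similar customer has, and the target lacks,
        # earns a score weighted by that customer's similarity.
        for item in items - own:
            scores[item] = scores.get(item, 0.0) + similarity

    # Highest score first; item name breaks ties for a stable order.
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:top_n]


if __name__ == "__main__":
    # Hypothetical purchase histories.
    history = {
        "ann": {"book_a", "book_b"},
        "bob": {"book_a", "book_b", "book_c"},
        "cat": {"book_a", "book_d"},
        "dan": {"book_e"},
    }
    print(recommend("ann", history))
```

Rule-based personalization, by contrast, would apply explicit conditions written by the merchant (for example, "show horoscope content matching the user's date of birth") rather than inferring preferences from like-minded customers.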


The semantic web is not just a matter of translation from HTML to RDF/OWL languages. It is a matter of understanding the content of the web through knowledge graphs, in which entities are connected by relationships. This content is composed of resources (web pages) that contain, for example, text, images and audio. Thus, entities need to be extracted from these resources. Currently, most web content is in HTML5, a W3C recommendation that allows structure to be described only marginally, with the help of annotations. The main challenge here is to transform unstructured data from plain HTML files into structured data (e.g., RDF or OWL). The current work provides first-hand information for dealing with unstructured heterogeneous data residing on the web using Twinkle, a Java tool for SPARQL query execution on FOAF (Friend Of A Friend) documents.


Author(s):  
Kimman Lui ◽  
Keith C.C. Chan ◽  
Kai-Pan Mark

Social software completely revolutionizes information sharing by allowing every individual to read, share and publish online. In terms of marketing, it is an effective way to understand consumers’ perceptions and beliefs in different local regions by analyzing and comparing the web content regarding a specific product retrieved on the Internet with respect to different locations. Interestingly, incidents originating in one location may attract more Internet discussion from individuals in remote locations. Therefore, it is difficult to measure the strength of people’s perceptions between different locations if we rely solely on web traffic statistics. Moreover, it is difficult to compare the strength of perceptions retrieved by different search engines, at different times, and on different topics. To overcome these inadequacies, the authors introduce a quantitative metric, Perceived Index on Information (PI), to measure the strength of web content over different search engines, different time intervals, and different topics with respect to geographical locations. Further visualizing PI in maps provides an instant and low-cost means for word-of-mouth analysis that brings competitive advantages in business marketing.


Author(s):  
Miguel A. Morales-Arroyo ◽  
Foo Chee Yuan ◽  
Lim Thian Muar ◽  
Kwek Choon Hwee

With the large amount of information available on the WWW, the ability to distinguish relevant from irrelevant data becomes a crucial factor. In this project, eight web scraping spiders were configured and evaluated for their functionality in order to determine their suitability for use by Interactive Digital Media (IDM) start-ups in competitive intelligence gathering. These spiders were chosen from the Internet because of their availability and low cost. Each spider was configured and tested on two web sites. The evaluation was first carried out individually to give each spider a score and then as a team to moderate the scores. The Web Info Extractor has the highest overall score as a web scraping spider, while the Web Content Extractor has the best task analysis result. After the evaluation process, it is concluded that different spiders have varying capabilities and thus are suitable for different tasks. A spider that can handle more complex tasks is usually inherently more complex to configure and less user-friendly. Hence, in order to select the correct spider, companies should understand the tasks undertaken by their customers through basic task analysis, as well as the amount of resources they have at their disposal for configuring and operating the spiders.


Author(s):  
Peter O’Connor

Since its launch in 1994, the Web has continued to grow at a phenomenal rate, from an estimated one billion Web documents in 2001 to over eleven billion in January 2005 (Gulli & Signorini, 2005). For most users, navigating this ever expanding sea of data has become a significant challenge. Search engines, sites that maintain indexes of Web content and allow users to specify words or phrases and then return a list of sites that potentially match these criteria, have become a key way of finding information on the Web. Google, the world’s largest search engine, is estimated to have indexed over 8 billion pages (Sullivan, 2004c) and to be used in nearly 50% of consumer searches (Nielsen NetRatings, 2006). Over 6.4 billion individual searches took place during May 2006 within the USA alone (Comscore, 2006). Clearly, being favourably positioned in such search results is very important for site owners wishing to gain visibility with the online consumer. In the beginning, search engines were unbiased, striving to display the results that provided the most relevant answers to user queries (Sullivan, 2002). While many were supported by advertising, in general this took the form of banner advertisements: graphical adverts displayed across the top of the page and clearly differentiated from the engine’s search result listings. Today, however, search engines need more workable business models to meet the substantial costs of maintaining their databases and improving their technology (Princeton Research Associates, 2002). For that reason, many now market their ability to route consumers towards specific Web sites, blurring the line between their ‘results’ and their ‘advertisements’. Like tour guides supplementing their income by bringing potential customers to restaurants or gift shops, many search engines now actively direct users to sites which have paid for positioning on their results pages (Lastowka, 2002).
In most cases, these paid placements are based on advertisers bidding for the specific keywords under which they wish to be displayed. For example, an online sports retailer might wish to appear whenever users enter “running shoes” as a search criterion. Bidding on these keywords would ensure that the sports retailer’s site is displayed prominently in the resulting search listing. Controversy has arisen, however, over the use of trademarked terms as keyword triggers. For example, can the same online retailer bid on the keyword “Nike” and / or use the trademark “Nike” in the copy of the advert subsequently displayed, thus potentially diverting shoppers who might otherwise have bought directly from Nike.com, the Web site of the trademark owner? Clearly doing so compromises Nike’s brand equity, its monopolistic right to be able to profit from its investment in building up the Nike brand (Arvidsson, 2006). As George (2006, p. 215) puts it, “brands are the placards by which modern consumers choose their products”. Corporations rely on brands to stimulate consumer awareness and foster an affinity for their products (Spinello, 2006). Legal protection against brand infringement comes from trademark law, a subsection of intellectual property law that prevents third parties from benefiting from the value and goodwill built up in a brand (Gallafent, 2006). However, such legislation has developed in the off-line world. How do its principles and practices transfer to e-commerce? Although the subject is still developing, this paper examines the ethical and legal position surrounding trademark infringement in a specific area of the electronic arena: paid search advertising. The paper explains the rationale behind the problem, outlines the current legal situation and offers advice as to how trade name owners can better protect their e-brand.

