Data mining applications in E-Commerce

Himanshu Ramchandani, Arizona State University, 2015

The success of an e-commerce site lies in the number of buyers or the amount sold within a period of time. E-commerce websites collect a huge amount of data every day, such as purchase records and customers' browsing histories, so they arguably know their customers better than anyone else does, sometimes better than the customers themselves. In the past, such volumes of data were difficult to process and analyze using traditional database and software techniques. Now, with rapid developments in both storage and processing technology, businesses are able to derive more and more value from data mining and apply it to their operations. Not only does it allow merchants to gain deeper insights into customer behavior and industry trends, but it also lets them make more accurate decisions to improve just about every aspect of the business, from marketing and advertising to customer service, pricing, and supply chain management.

Marketing

One of the most widespread uses of data mining in e-commerce is customer relationship management (CRM). By building an effective CRM system, a company can become more efficient at acquiring new customers, increasing the value of existing customers, and retaining good customers. For e-commerce companies, using data mining to shape marketing strategy and acquire new customers differs greatly from traditional approaches such as customer surveys, advertisements, or the way shopkeepers maintained customer satisfaction in person. The trend is that the relationship has become impersonal. In the past, display advertising was based on the analysis of marketing experts using models such as the 4Ps and 4Cs. Now, thanks to big data and programmatic buying, online retailers can target their advertisements precisely at their customers [1]. Look-alike models are now widely used to find customer segments and identify potential customers.
With growing trust in online marketplaces, customers are willing to share more and more personal information with these websites. Their account information, purchase history, and shopping preferences are far more valuable than the online behavior that other institutions can capture by placing cookies or tracking pixels. With this information, online sellers can derive customer segments by clustering. Clustering is the automated grouping of related records: records with similar attribute values are placed in the same group. Customer segmentation is the foundation of an enterprise's effective selling, marketing, and service. It divides the whole customer base into groups such that customers within a group have similar attributes and customers in different groups have different attributes [2]. When sellers know who their customers are, what they want, where they are, and even how and when they want to be contacted, they can direct their advertisements and marketing activities to the right person, in the right place, at the right time. Turning potential customers into real buyers then becomes much easier.
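The clustering idea described above can be illustrated with a small k-means sketch. This is only one possible clustering method, not necessarily what any particular retailer uses; the function name `kmeans`, the two features, and the sample customers are all hypothetical, and a real pipeline would normally scale the features before measuring distances.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def kmeans(points, k, iterations=20):
    """Group points into k clusters; returns (centroids, labels)."""
    # Deterministic start (an assumption for reproducibility):
    # use the first k records as the initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each record to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, labels


# Hypothetical customers: (orders per year, average order value in dollars)
customers = [(2, 20.0), (40, 25.0), (3, 22.0), (38, 30.0), (1, 18.0), (42, 27.0)]
centroids, labels = kmeans(customers, k=2)
print(labels)  # [0, 1, 0, 1, 0, 1]: occasional buyers vs. frequent buyers
```

Each label identifies a segment, so a marketing campaign could then be addressed to one segment at a time.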
Amazon, as the largest Internet-based retailer in the United States, has an unrivalled bank of data on online consumer purchasing behavior that it can mine from its 152 million customer accounts. In marketing alone, its ad technology has helped many retailers succeed, and its seller services, which include big data marketing, have become a part of Amazon's revenue. This business brought in about $500 million of Amazon's $48 billion in revenue in 2011, Baird & Co. senior research analyst Colin Sebastian estimates. Although it is still a relatively small and low-key business for Amazon, it is widely accepted that using its data to expand its advertising business could open up new fronts of competition with Google. Amazon has developed an in-house platform for targeting ads to people who have visited and then left its sites, making it likely that the company will open up these advertising services more widely over the next year [3].

Recommendation Engines

Recommendation engines are the analytic powerhouses that form the backbone of any e-commerce company. Unlike traditional retail stores, e-commerce companies have no face-to-face interaction with customers through which to understand their needs, and, owing to the nature of the business, they have massive lists of products to sell. Recommendation engines connect customer needs with product offerings to suggest what customers should buy next. They need to consider both quantitative data (purchase history, etc.) and qualitative data (customer behavior). Hence e-commerce companies rely on recommendation engines for cross-selling. Netflix and Amazon are the two companies that come to mind when one thinks about the commercial impact of recommendation engines. It is not just about recommending what to buy or watch next; it is about creating a personalized experience for each customer.
Neither company simply used readily available, cookie-cutter solutions to mine customer data. Both created and continuously evolved their own home-grown data mining and analytics applications to make recommendations. This makes sense: each company interacts differently with its customers to address different needs, and recommendation engines are all about creating personalized algorithms to meet those needs. The following is a brief note about Netflix's recommendation engine, as stated in Wired magazine: “Trying to understand the invisible array of algorithms that power your Netflix suggestions has long been a favorite sport, but what’s actually going on in that galaxy of big data, those billions and billions of ratings stars? Turns out there are 800 Netflix engineers working behind the scenes and the company estimates that 75 percent of viewer activity is driven by recommendation.”

A recommendation engine factors in multiple dimensions when performing analytics. For example, Netflix tracks not only what one watched but also what one is browsing, how much time one spends, and where and when one browses. It is vital to analyze data from all these dimensions to give the right recommendations. When Amazon recommends a product a customer might want to buy, it is no coincidence. Amazon revolutionized the e-commerce industry with its state-of-the-art recommendation engine, which factors in purchase history, current trends, and preferences. To a great extent, Amazon's success can be attributed to its home-grown recommendation engine, as Amazon has integrated recommendations into nearly every part of the purchasing process, from product discovery to checkout.

From a company's perspective, recommendation engines are tools for targeted advertising and selling. They are disrupting customer purchase lifecycles and shortening purchase timelines. Many companies devote a significant portion of their marketing budget to advertising, be it TV, email, billboards, or other media, in order to reach their target customer base. E-commerce companies have the advantage of being able to target a product at the right person using recommendation engines. So these engines are not just cross-selling tools but also a way to advertise products to the right customer base, which is especially important during new product launches. Customers are becoming less loyal to particular brands, and a brand that needs to stay connected to its customers should consider recommendation engines in e-commerce channels an important medium.

Recommendation engines are not one-time installations. They need to evolve constantly, based on what is learned over time. Netflix and Amazon have constantly evolved their recommendation engines and improved them significantly. Over time, companies learn about other important dimensions of their customers that can be factored into the analysis.

A recommendation engine needs strong integration with the inventory management system. It looks at what a customer would buy and goes through the massive inventory list to recommend the right product that is readily available for the customer to buy. Confirming that a product is available before recommending it is critical. The examples above show how important recommendation engines are to e-commerce companies.
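The link between purchase history and inventory described above can be sketched as a simple co-occurrence recommender with an availability filter. This is a deliberately minimal illustration, not the method Amazon or Netflix uses; the function names, the sample baskets, and the `stock` dictionary are hypothetical.

```python
from collections import Counter
from itertools import permutations


def build_cooccurrence(baskets):
    """Count how often each ordered pair of products shares an order."""
    counts = {}
    for basket in baskets:
        for a, b in permutations(set(basket), 2):
            counts.setdefault(a, Counter())[b] += 1
    return counts


def recommend(history, cooccurrence, stock, top_n=3):
    """Rank products bought alongside the customer's past purchases,
    skipping items already bought and items that are not in stock."""
    scores = Counter()
    for item in history:
        for other, n in cooccurrence.get(item, {}).items():
            if other not in history and stock.get(other, 0) > 0:
                scores[other] += n
    # Sort by score, then by name, so that ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical past orders and current inventory levels
baskets = [
    ["phone", "case", "charger"],
    ["phone", "case"],
    ["phone", "charger", "earbuds"],
    ["laptop", "mouse"],
]
stock = {"case": 12, "charger": 0, "earbuds": 5, "mouse": 7}

cooccurrence = build_cooccurrence(baskets)
print(recommend(["phone"], cooccurrence, stock))  # ['case', 'earbuds']
```

The charger is bought with phones as often as the case is, but it is out of stock, so it is left out of the recommendation.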
Finally, it is very important for recommendation engines to strike the right balance and not over-analyze: at the end of the day, a customer still needs the flexibility to easily choose and buy anything beyond what the recommendation engine deems right for that customer.

Pricing

One application of big data analysis commonly used in online markets is pricing analytics. Pricing analytics is the combination of methods and strategies used to optimize prices in order to maximize profits while covering costs and appealing to customers. Historically, pricing has been a realm of business intelligence that offers many different solutions to the same problem, and it is rare to find a company that is completely satisfied with its pricing structure. Corporations have long struggled to find an adequate way to predict how their customer base will respond to an increase or decrease in price; however, thanks to the arrival of big data and the analytic opportunities it offers, companies are now innovating in ways that were not previously possible.

Zilliant is one of the companies innovating in pricing through data analytics. It offers two products, MarginMax and SalesMax. Ben Kepes of Forbes reports that “MarginMax allows organizations to predict the bottom line impact of pricing strategies, while SalesMax uncovers opportunities that hide within companies’ customer bases.” [7] Drawing on large stores of past data, Zilliant is able to predict the profits and losses of different pricing strategies while preventing customer churn. These types of models are possible only because of Zilliant's investment in big data.
E-commerce companies are at the forefront of new and innovative pricing strategies. They have the advantage of being able to re-evaluate their prices instantaneously according to the most recent stock levels, competitors' prices, and their customers' past history. Dynamic pricing is a strategy employed by many companies looking to find the maximum amount customers are willing to pay for a product or service. For example, Uber's pricing model is built on a base price that is multiplied by a surge rate, which depends on demand in the area, the time of day, and a number of other factors. An interesting aspect of Uber's pricing strategy is that surge pricing is set in order both to meet demand and to increase supply. Uber employs independent contractors who can choose whether or not to pick up passengers in a given area. Using a proprietary algorithm, Uber determines the fare that will meet demand and attract more drivers to the area, increasing supply. “What this means is that in the case of Uber, surge pricing doesn’t just make rides more expensive… It also expands the number of people who are actually able to get a ride. Customers pay more, but they also get a ride that they otherwise would not have gotten. This is exactly how a market is supposed to work: higher demand induces more supply.” [8] An Uber representative describes the algorithm this way: “If drivers wait two additional minutes to find a client but pickup times decrease by four minutes, then this is a net efficiency gain. We increase the wait time between requests until no gain can be made.” [9] Uber and Zilliant are only two of the companies making strides in predictive analytics for pricing.
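The "base price multiplied by a surge rate" structure can be made concrete with a toy calculation. Uber's actual algorithm is proprietary, so the rule below is purely an assumption for illustration: the surge rate is taken to be the ratio of ride requests to available drivers, floored at 1.0 and capped at a hypothetical maximum.

```python
def surge_multiplier(ride_requests, available_drivers, cap=3.0):
    """Illustrative surge rate: demand/supply ratio clamped to [1.0, cap]."""
    if available_drivers <= 0:
        return cap  # no supply at all: charge the maximum allowed rate
    ratio = ride_requests / available_drivers
    return max(1.0, min(cap, ratio))


def fare(base_price, ride_requests, available_drivers):
    """Fare = base price multiplied by the current surge rate."""
    return round(base_price * surge_multiplier(ride_requests, available_drivers), 2)


print(fare(10.0, 5, 20))    # 10.0  (supply exceeds demand, no surge)
print(fare(10.0, 30, 20))   # 15.0  (1.5x surge)
print(fare(10.0, 100, 10))  # 30.0  (ratio of 10 is capped at 3x)
```

In a fuller model the higher fare would also feed back into supply by attracting more drivers, which is the effect the quotations above describe.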
Amazon has been tracking customer data such as past purchases, items customers rate favorably, items in their online shopping carts, and similar purchases in order to set prices and make product recommendations. In this new era of predictive analytics using big data, e-commerce sites factor in many variables, including available inventory, customer preferences, and historical product pricing. It is not hard to imagine a near future in which every e-commerce site uses some form of predictive analytics to determine its pricing model. The algorithms used by industry leaders are becoming more and more efficient, and other businesses are willing to invest significant resources in reinventing their pricing models to match.

Supply Chain Management

According to the Council of Supply Chain Management Professionals (CSCMP), supply chain management (SCM) encompasses the planning and management of all activities involved in sourcing, procurement, conversion, and logistics management. It is an approach that meets customer demand by effectively managing the internal system of supply. SCM is the backbone of the e-commerce industry, because having the product at the right location and delivering it at the right time is its basic function in an e-commerce business. It makes the business more efficient and keeps it growing even as competition increases. As a result, supply chain management has become more of an analytical process, and balancing supply against demand has become a sharper concern. The aspects of SCM where analytics is used successfully include pricing, warehousing, delivery, and purchasing. The pressures on the supply chain include the increasing complexity of products owing to variation in their attributes, heightened customer expectations for better-quality goods and services, unpredictable (mainly seasonal) demand, and new innovations in every market segment. Data analytics has helped businesses make the SCM process more effective and has provided end-to-end visibility throughout the supply chain.

Pricing has become more efficient throughout e-commerce through the use of real-time analytics. Customers do not care about the model or technology used to determine the price of a product on a website; they are looking for the cheapest price available. Customers sometimes give some weight to loyalty and brand name while shopping, but only if they have had a long and happy relationship with the company. That is not the case with a new or an average customer.

Analytics has helped e-retailers build intelligent warehouses in which products are stored so that it takes less time than it did 5-10 years ago to process an order, pack it by category, and hand it to the customer or a courier vendor. Using historical data, warehouses have begun to manage inventory so that products are stocked in large quantities at high-demand locations and selected products are stocked according to seasonal demand. In a study published in November 2013, Capgemini [10] reported that around 89% of consumers said they would shop with a different retailer in the future if their order arrived later than expected, and 73% said they would buy from a different retailer if the product was unavailable or out of stock. This customer behavior is not new; the same was true of traditional retail businesses.
Companies such as Amazon, however, made the best use of information technology and pioneered same-day delivery, turning warehouses into staging areas from which products are delivered directly to customers.

Customers leave a digital footprint whenever they shop online, and this is where streaming analytics begins. E-retailers analyze customers' needs and the categories of products they look at frequently. When customers visit an e-commerce website, they see recommendations for products similar to the ones they are searching for. These recommendations are driven broadly by two things, the availability of the product at the warehouse and the price range in which the customer is looking, although other factors also drive the recommendation engine. This helps customers make better decisions and see all the options available.

Some products sell well throughout the year but peak in certain seasons. Demand for products such as umbrellas, school supplies, and bikes shoots up at different times, such as the rainy season or the start of a new school or college term. Historical data on these purchases, broken down by region, is useful for predicting future demand and having supply ready to deliver at the right time. In 2004, when residents of Florida were threatened by Hurricane Frances, which was on its way to the Atlantic coast, Walmart saw an opportunity to use a data-driven approach and predictive analytics to anticipate its customers' demand [11]. It looked at customers' purchasing history from just before Hurricane Charley struck Florida a few weeks earlier, identified the products that had sold best at that time, and began stocking those items in its stores in large quantities. Beer was the top-selling product before the hurricanes.
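Predicting seasonal demand from historical purchases, as described above, can be sketched with a very simple seasonal average. This is a minimal illustration under stated assumptions, not a production forecasting method: the function names, the sales figures, and the 25% safety margin are all hypothetical.

```python
from collections import defaultdict
from math import ceil
from statistics import mean


def seasonal_forecast(history):
    """history: iterable of (month, units_sold) pairs from past years.
    Returns {month: average units sold in that month}."""
    by_month = defaultdict(list)
    for month, units in history:
        by_month[month].append(units)
    return {month: mean(units) for month, units in by_month.items()}


def units_to_stock(forecast, month, safety_factor=1.25):
    """Stock the forecast demand plus a safety margin, rounded up."""
    return ceil(forecast[month] * safety_factor)


# Hypothetical umbrella sales for one region over two years: (month, units)
umbrella_sales = [(6, 100), (7, 400), (12, 40), (6, 120), (7, 440), (12, 60)]

forecast = seasonal_forecast(umbrella_sales)
print(units_to_stock(forecast, 7))   # 525: the rainy-season peak
print(units_to_stock(forecast, 12))  # 63: the off-season level
```

Keeping one such forecast per region would match the text's point that historical data is most useful when broken down by location.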
As the Walmart example suggests, melding customers' historical data with real-time streaming data improves forecasting and demand planning and makes production operations more efficient. Sometimes products are left unsold, and perishable items have a limited shelf life. Companies incur losses if the inventory of these products is not managed properly. SCM and data analytics help businesses stock a sufficient quantity of perishable items according to projected demand throughout the year and by season.

At the end of the day, it is all about keeping customers with you and keeping them happy by anticipating their needs and consistently delivering the experience they want. Real-time and click-stream analytics help a business stay ahead in supply chain management. Churn among online customers is nothing new. If a customer is unhappy with the service on any given day, it takes one click to hop to another e-commerce website and create a new digital footprint.

Fraud Detection

According to a recent report by CyberSource, fraudulent transactions cost online retailers approximately $3.5 billion, or 0.9% of their total sales (CyberSource, 2014-15) [12]. This amount, which should ideally be part of a company's profit, ends up as a cost. One of the biggest data breaches in U.S. history was the Target data breach, in which confidential credit and debit card information, as well as personal information, of millions of customers was compromised. The company suffered a loss of approximately $148 million (Forbes) [13], and Target's profit dropped 46% compared with the previous year. In addition, the loss of trust among millions of loyal customers was a major blow to the company. Fraud in e-commerce can thus have a massive effect, both financially and in lost customers. An e-commerce retailer's goal is zero fraud; once a distant dream, it now seems within reach thanks to the rise of data analytics in recent years. The fraud rate in e-commerce differs by channel: it is highest for web stores, followed by telephone and mail order.

Types of Fraud

The key types of fraud that affect e-commerce retailers are credit card fraud, identity fraud, and return fraud.
Credit card fraud, one of the most prevalent types of fraud affecting online retailers, exploits what is known as the “Card Not Present (CNP)” interaction: the fraudster takes advantage of the fact that the e-retailer cannot physically see the card. Most companies use methods provided by the credit card companies, such as Address Verification Service (AVS) and Card Verification Number (CVN), to minimize fraud. These methods are not foolproof, however, and credit card fraud remains a concern for online retailers. Identity fraud, on the other hand, involves stealing a customer's identity and personal information. The fraudster logs into the company's site using the stolen data, shops online, and ships to a different location. This type of fraud affects companies heavily, as it results in large chargebacks. Return fraud can be committed in many ways, such as returning merchandise after using it, or claiming that a product was not delivered after accepting it and then selling it through a different channel. Other types of fraud include triangulation schemes, phishing, pharming, whaling, and botnets.
Preventing Fraud Using Analytics

Analytics can help in fraud detection by discovering non-obvious relationships, using visualization techniques to identify fraud patterns, and applying machine learning to prevent the recurrence of attacks. Previously, retailers used only a subset of their data for fraud analysis, because using the entire data set was both time-consuming and expensive. Analyzing the full data set is now possible with the advent of analytics tools such as Hadoop, which allow all this data to be aggregated and analyzed. Analyzing the complete data set has several benefits:

(a) Screening all transactions against pre-defined fraud rules and models to detect fraud.
(b) Identifying new fraud patterns and business rules and adding them to the fraud rules.
(c) Minimizing false positives, which reduces the cost of fraud prevention and avoids turning away legitimate customers.

For instance, return fraud can be minimized by using analytics to determine whether the merchandise was actually delivered, drawing on data from social networks, image analysis, and so on. Fraud can be detected in real time through automated screening of transactions combined with data from other sources, such as Google Maps lookups and social networking sites. Credit card fraud, for example, can be prevented by combining tools such as Address Verification Service (AVS) and Card Verification Number (CVN) with data from other sources, such as the customer's social feeds, web logs, and geo data from the customer's mobile app. Because this is done in real time, the fraudulent transaction can be declined before it completes, which saves online retailers both chargeback amounts and loyal customers. Real-time fraud detection is also applied to return fraud by analyzing location data from sensors attached to high-value goods, which lets retailers know exactly when and where a good was delivered to the customer.
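Screening transactions against pre-defined fraud rules, as described above, can be sketched as follows. The rules, field names, and threshold are hypothetical and chosen only to mirror the signals mentioned in the text (AVS and CVN results plus location data); real systems combine many more signals and usually statistical models as well.

```python
def screen(txn, threshold=2):
    """Apply simple fraud rules to one transaction.
    Returns which rules fired and whether to decline the transaction."""
    rules = {
        "avs_mismatch": not txn["avs_match"],
        "cvn_mismatch": not txn["cvn_match"],
        "ships_outside_billing_country": txn["ship_country"] != txn["billing_country"],
        "device_outside_shipping_country": txn["device_country"] != txn["ship_country"],
    }
    reasons = [name for name, fired in rules.items() if fired]
    # Declining only when several rules fire keeps false positives down,
    # so fewer legitimate customers are turned away.
    return {"decline": len(reasons) >= threshold, "reasons": reasons}


# Hypothetical transactions
legit = {"avs_match": True, "cvn_match": True,
         "billing_country": "US", "ship_country": "US", "device_country": "US"}
suspect = {"avs_match": False, "cvn_match": True,
           "billing_country": "US", "ship_country": "BR", "device_country": "RU"}

print(screen(legit)["decline"])    # False
print(screen(suspect)["decline"])  # True
```

New fraud patterns discovered through analysis would be added as further entries in `rules`, which corresponds to benefit (b) in the list above.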
Online retailers can also use analytics tools to analyze visual data from various sources to detect fraud. These tools can be used to identify the geographical regions, customers, and merchandise that have higher fraud rates, based on analysis of historical data.

New types of fraud will keep emerging as online shopping grows. Regardless of which new “normal” pattern emerges, analytics solutions will keep reaffirming that good can overcome evil, as long as the good have analytics on their side.
References

1. Sadath, L. (n.d.). Data Mining in E-Commerce: A CRM Platform. International Journal of Computer Applications (0975 – 8887), 68(No. 24, April 2013), 32–37.
2. Feng, D. Zaimei. Z. Fang. Z. Jianheng. J. (n.d.). Application Study of Data Mining on Customer Relationship Management in E-commerce.
3. Leber, J. (2013). Amazon Woos Advertisers with What It Knows about Consumers. MIT Technology Review. Retrieved from: woos-advertisers-with-what-it-knows-about-consumers/
4. build-one
5.
6.
7. Ben Kepes. Forbes. 11/15/2013. pricing-fit/
8. James Surowiecki. TechnologyReview. 8/19/2014.
9.
10. Capgemini: The Supply Chain Impact Survey Research Results. access/resource/pdf/capgemini_scm_and_consumer_survey.pdf
11.
12. CyberSource. (2014-15). Online Fraud Management Benchmark Study. San Francisco: CyberSource. s/CYBS-Fraud-Benchmark-Report.pdf?utm_campaign=2015%20Fraud%20Report%20Form%20Auto%20Responder&utm_medium=email&utm_source=Eloqua
13. Forbes. Target cost of data breach. reveals-cost-of-data-breach/