Collaborative Filtering, Engage & Webmining ... The Internet Store Moves Closer To Reality

While the entire internet might be considered a store, the practical application of this will be that the most effective websites will themselves become interactive stores where each click further defines the product/service interests of the site visitor or "store" customer.

November 1998

In our previous post "The Internet Store" (October 1998) we suggested that the entire internet surf experience might be viewed using the analogy of entering a supermarket store where all transactions can be tracked.

The promise of greater interactivity on websites is brought closer to reality by the implementation of a new technology called "Collaborative Filtering" being put into use by a number of cutting-edge web merchants.

This development is reported by CACI, one of the world's leading suppliers of demographic marketing information. In an article titled "Collaborative Filtering Paves the Way for On-Line Merchants" from their November 3, 1998 CACI Marketing Systems Newsletter, CACI notes:

"As more and more web browsers become web purchasers, some top on-line merchants are beginning to utilize a system called collaborative filtering to capture basic demographics, such as buying habits and requests. They then use this data to create a unique store for each customer. For example, if you were to visit an on-line music store and enter an artist, or album title, the site will return that title/artist along with other similar artists. This returned information could be purchases made by other buyers who requested the same title, or similar titles within a specific genre. This process effectively utilizes a seller's ability to recommend additional, similar purchases and encourage the buyer to purchase additional similar products. This is very similar to the sales cycle in the retail environment: you just purchased an expensive leather jacket, and the salesperson/merchant could easily recommend a leather protectant spray with minimal effort. As you can see, this practice can now be true with the on-line world of shoppers. The ease of purchasing on-line also translates to suggestive selling."

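The music-store example in the CACI passage can be sketched in a few lines of Python. This is only a minimal illustration of the "other buyers of this title also bought" idea, not a description of any merchant's actual system; the customer names, titles and function names are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical purchase histories: customer -> set of titles bought.
purchases = {
    "ann": {"Abbey Road", "Revolver", "Pet Sounds"},
    "bob": {"Abbey Road", "Revolver"},
    "cat": {"Abbey Road", "Pet Sounds", "Kind of Blue"},
    "dan": {"Kind of Blue", "Blue Train"},
}

def build_co_purchases(histories):
    """Count how often each pair of titles appears in the same customer's history."""
    co = defaultdict(Counter)
    for titles in histories.values():
        for a, b in permutations(titles, 2):
            co[a][b] += 1
    return co

def recommend(title, co, k=2):
    """Return up to k titles most often bought alongside `title`.

    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(co[title].items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:k]]

co_purchases = build_co_purchases(purchases)
print(recommend("Abbey Road", co_purchases))  # ['Pet Sounds', 'Revolver']
```

A visitor who asks for one title is shown the titles that share the most buyers with it, which is the on-line counterpart of the salesperson suggesting the leather protectant spray.
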
Engage And The Personalization Of Web Ads

The collaborative filter technology is part of the move of internet companies to utilize an increasing amount of information about site visitors. As announced in the August 16, 1998 Wall Street Journal, some of the largest commercial sites on the web have agreed to feed information about their customers' reading, shopping and entertainment habits into a system developed by Engage Technologies. Already, Engage is tracking the moves of more than 30 million internet users. The WSJ notes that the agreement calls for the participating web sites to track their users so that advertisements can be precisely aimed at the most likely prospects for goods and services.

An example of the Engage system in action might be this. An internet user who looks up information about France on a travel site in the network might receive ads for airlines flying into Paris and for hotels in Paris. As the WSJ notes, mailing list companies are limited to identifying people for mailing lists by broad interests such as subscribers to a particular magazine on fishing. However, internet-based systems can find a person who reads articles about fishing even if the web page he is visiting is part of a general news or recreation site.

While Engage centers around delivering personalized ads, what about delivering personalized products? Rather than an ad for the product, why not offer the product? As web pages become more personalized to the individual surfer, the products offered would come closer and closer to his or her interests and, in effect, require fewer and fewer advertisements to sell them. In this sense, a surfer's trip through a particular web site is like a trip to a number of stores, all virtually created based on click activity.

Websites As A Collection Of Spontaneous Virtual Stores

Good sales people always suggest related products based on what you tell them. This is a well-known concept in sales termed "cross-selling." For instance, assume a shopping trip to a department store like Macy's. You go to the clothing section and appear interested in buying a particular pair of pants. A good sales person will suggest a related product, such as a belt, that you might consider when buying your pants.

Another sales method that might be used by the salesperson is termed "upselling." This is not based on other types of products but rather on selling you a more expensive product in the category you want, or more of that particular product. For instance, you want a pair of X pants. The sales person may respond by suggesting Y pants, which cost more (and hopefully also carry a higher margin for the store), or by talking you into buying more X pants through a special volume purchase.

Imagine for a moment that this sales person was a type of magic wizard with the ability to transform the entire store into an environment based on what you tell them. For instance, knowing you want pants, this wizard sales person might make an incredible pants store "pop" up all around you. In the pants store there might be a number of belts because, from previous sales experience, the sales person and the store know that the purchase of pants is related to the purchase of belts.

A website visitor can be compared to a customer in a retail store. Your site is the sales person.

The current scenario is that the visitor will visit parts of your store they have an interest in. They may be interested in buying "pants" so to speak and go to a hypertext linked page that says "pants."

When they arrive at these pages, there are hypertext links which lead them to other areas. These are more targeted hypertext links than the home page or the page where they entered because they have made a choice and (hopefully) you have realized this in programming your site.

But what if the hypertext links suggested to them on your various pages have been created not only by you but also by CGI programming that suggests other links?

Making It Work: Commissions On Outbound Collaborative Links

You might be a good hearted person who desires to send people to areas based on their needs. But shouldn't you be rewarded for this?

In effect, a sales person in the "pants" section of a department store should make some type of commission for sending someone into another department if they in fact buy something in the other department. Of course this doesn't happen in the world of "real" department stores (and could be one of the major challenges to them in the age of the internet).

But if a site visitor is referred to another site for a product from one of the pages in your site, shouldn't you get some type of commission if this surfer buys the product on the other site?

So far, Amazon has been the major site to realize this with the implementation of their Associates Program. This innovative program gives sites which refer to them a percentage if the person buys. (We term this an "outbound collaborative" link program where one is paid for referrals. This type of arrangement is distinguished from an "inbound collaborative" link program where one pays for referrals.) Currently on Amazon, there are about 400,000 books which Amazon pays a 15% referral commission on and another 1.5 million which they pay a 5% referral on.
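The commission arithmetic behind such a program can be sketched in a few lines of Python. The 15% and 5% rates are the ones quoted above; the tier labels, order prices and function name are our own assumptions for illustration and are not part of Amazon's program.

```python
from decimal import Decimal

# Rates as quoted in the article; the tier labels are hypothetical.
RATES = {"featured": Decimal("0.15"), "standard": Decimal("0.05")}

def referral_commission(orders):
    """Sum the commission owed to a referring site.

    `orders` is a list of (price, tier) pairs for purchases that
    followed a click from the referring site.
    """
    total = sum(
        (Decimal(price) * RATES[tier] for price, tier in orders),
        Decimal("0"),
    )
    return total.quantize(Decimal("0.01"))

# Hypothetical month of referred purchases.
orders = [("20.00", "featured"), ("30.00", "standard"), ("10.00", "standard")]
print(referral_commission(orders))  # 5.00
```
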

The concept of the Associates Program (which Amazon so far "preaches" better than it "practices") needs to be used as a core strategy at the heart of any internet site that wants to make full use of the internet as an interactive sales tool. One affiliate type of program doing well outside of Amazon's books is Bill Lederer's site called Artuframe. The site is billed as the Web's largest collection of framed and unframed art. His affiliates program helps him sell his collection of art.

In effect, your website should have on it "super" sales people who never sleep and always know where to refer customers who visit your site (store) based on their click information.

Making It Work: Commissions On Inbound Collaborative Links

The outbound commission strategy also works on inbound links, or, those surfers coming into your site to purchase products from other locations. For example, suppose you are a sales person in a department store and get a shopper referred to you from another department. In order for you to encourage the behavior of the other sales person in sending a person to you, a commission (or some type of reward system) goes a long way in reinforcing the propensity for this type of behavior.

The same principle holds true with the internet. Sites which send customers/prospects to you need to also be rewarded via a type of reward system. In effect, you become a type of Amazon here providing commission incentives on all products sold on your site to the referral sites.

Making It Work: Focus On Commissions By Rating Inbound & Outbound Sites

Not all inbound and outbound links are equal, and methods of judging their value go far beyond mere "hits." But this seems to be a difficult concept for many web entrepreneurs to understand and apply. Many websites now actively measure incoming page "hits," which can be segmented by referring site. Beyond this, however, sales volume, value per customer and ultimately profit are the factors which most deserve focus. For example, suppose that in one month Site C receives 1,000 referrals from Site A and 2,000 from Site B. At first glance, based just on "hits," Site B looks like the key site. But suppose that average sales per referral from Site A are $20 and those from Site B are $5. In this situation, overall sales from A are $20,000 and from B are $10,000. The greater value per customer lies with Site A and its $20 average purchase per visitor.

Extend the math here to profit margins. Suppose that the products purchased by Site B customers have a 70% profit margin while products purchased by Site A customers have a 20% profit margin. Then, although the sales average of Site A is greater, overall profit from Site A is only $4,000 while profit from Site B is $7,000.

From    Hits     Sales      Average   Margin   Profit
A       1,000    $20,000    $20       20%      $4,000
B       2,000    $10,000    $5        70%      $7,000

Inbound Hit Value To Site C From Sites A & B
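The figures in the table can be reproduced, and the referring sites ranked by each measure, with a short Python sketch. The numbers are those given above; the class and field names are our own, chosen only for illustration.

```python
from dataclasses import dataclass

@dataclass
class Referrer:
    name: str
    hits: int        # referred visitors in the month
    avg_sale: int    # average sale per referred visitor, in dollars
    margin_pct: int  # profit margin on those sales, in percent

    @property
    def sales(self) -> int:
        return self.hits * self.avg_sale

    @property
    def profit(self) -> float:
        return self.sales * self.margin_pct / 100

sites = [Referrer("A", 1000, 20, 20), Referrer("B", 2000, 5, 70)]

# Which site looks "best" depends entirely on the measure chosen.
for measure in ("hits", "avg_sale", "sales", "profit"):
    best = max(sites, key=lambda s: getattr(s, measure))
    print(f"best by {measure}: Site {best.name}")
```

Site B wins on hits and on profit, while Site A wins on average sale and on total sales, which is why a ranking by "hits" alone says so little about what a referring site is worth.
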

This information is important not only in determining commissions for inbound sites but also in valuing any banner ads that might be placed on these sites. Should one pay for banner ads based on old advertising concepts like CPM (cost per thousand viewers of an ad) or on something more like QPM (quality per thousand viewers of an ad)? If this criterion were used, which of the sites above should be able to demand the most for a banner ad: the one bringing in the greatest sales per visitor (A), or the one bringing in the greatest profit and the greatest number of visitors (B)?

The same type of reasoning also applies to rating the value of outbound sites. Your internet store should be paid commissions on "outbound links" based on criteria similar to those used for "inbound links." Again, your "worth" to other sites should be based on the value of the traffic being sent there, and value is not always solely a "hits" numbers game.

Making It Work: Keeping Them In The Store

Presently, links (as well as ads) send surfers away from sites. But what if surfers were not sent away but rather could build a shopping cart within a site using products from other sites? The key here is not to send them away from the site (risking the possibility they won't return) but rather to offer products from other sites in your store, in the same way a catalog company compiles products from other suppliers in its catalog. A catalog never tells you to put it down and pick up another one.

One of the key questions here is whether a shopping cart on one site can be built using links from other sites. Sites that review books and are part of Amazon's Associates Program have little Amazon logos under their book reviews. But why send them away via an Amazon click if books might be combined with other products in one shopping cart before check-out time at the old virtual store?

Data-Miners: The New Heroes Of The Web?

Remember when site design strategies were based on flash and little more? With the emergence of technologies and companies such as "collaborative filtering" and Engage, flash gives way to true visitor personalization. After all, when you are getting materials closer and closer to your interests, there is less need for flash. In this sense, it is better to be in a plain-looking "store" that has everything you want than a flashy "store" that has little you want.

A company called Webminer is one of an interesting new "breed" of data-mining services which believe in personalization over flash. Their service matches visitor demographics (which Webminer has) with the logs, cookies and forms that the client site provides.

As Jesus Mena of Webminer notes, "There are a lot of companies that can build nice looking sites but one needs to separate appearance from usefulness to the web surfer/customer." Webminer analyzes a client's server data for hidden patterns in order to extract the signature of that client's most profitable online customers.

After merging the client's log and forms information, Webminer uses neural nets, machine-learning and genetic algorithms for the clustering, segmentation, classification and profiling of client website visitors. As Mena notes, "Webmining is about recognizing customer signatures - the evolving patterns created every time a visitor stops at your website. Our webmining services can provide you a profile of who your customers are and assist you in predicting their online behavior and propensity to buy."

The "Mine Your Own Business" article by Jesus Mena expands on the above concepts and those discussed throughout this article. It is one of the clearest explanations we have ever seen on data mining web sites and the future of internet marketing. We encourage our readers to take a look. Below is an example from the article.


The following block represents a database of Web site visitors and online transactions (sales) compiled from Customer and Non-Customer records. The blocked clusters represent the records of C (customers) found by the data mining tool. The blocked clusters within the box discovered by the tool can be seen either as a graphical decision tree or as a set of rules. These rules enable Webmasters and management to quickly understand what type of activity is evolving within their Web sites, by providing succinct rules using the data components captured by log and registration files.

IF zip code (94121-94123)
AND age (45-49)
AND gender Male
THEN /Websell/Product#9.htm 87%

In plain English, this rule translates as, "If a visitor is from zip codes 94121, 94122, and 94123, and is aged 45 to 49, and is male, then there is a high likelihood (87 percent) that he will request Product#9.htm (one of the main pages for this commercial Web site)."

Off-Line Demographics + Log, Forms, Cookie Demographics
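To make the rule above concrete, here is a minimal Python sketch that tests visitor records against its conditions and measures how often matching visitors requested the page. We are assuming, for illustration only, that the 87 percent figure can be read as this kind of share; the article does not say how the tool derives it. The records and function names are hypothetical.

```python
# Hypothetical visitor records merged from registration forms and server logs.
visitors = [
    {"zip": 94121, "age": 46, "gender": "M", "pages": {"/Websell/Product#9.htm"}},
    {"zip": 94122, "age": 48, "gender": "M", "pages": {"/Websell/Product#9.htm"}},
    {"zip": 94123, "age": 45, "gender": "M", "pages": {"/index.htm"}},
    {"zip": 94121, "age": 30, "gender": "F", "pages": {"/Websell/Product#9.htm"}},
]

def matches_rule(v):
    """The IF/AND conditions: zip 94121-94123, age 45-49, male."""
    return 94121 <= v["zip"] <= 94123 and 45 <= v["age"] <= 49 and v["gender"] == "M"

def rule_confidence(records, condition, target_page):
    """Share of records satisfying `condition` that also requested `target_page`."""
    matched = [r for r in records if condition(r)]
    if not matched:
        return 0.0
    hits = sum(target_page in r["pages"] for r in matched)
    return hits / len(matched)

conf = rule_confidence(visitors, matches_rule, "/Websell/Product#9.htm")
print(f"{conf:.0%}")  # 67% for this toy data: 2 of the 3 matching visitors
```
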

Notice in the above that traditional "off-site" demographic information is used to define the web surfer. By "off-site" we mean attributes that are not necessarily web related, such as zip code, age and gender. Although it has been shown that zip, age and gender do have correlations to web use, these are not based on web click activity, so we refer to this demographic data as "off-site" rather than "on-site" data, which is generated exclusively via click activity on the internet.

We suggest that "on-site" demographics can be used to build product selections in something similar to the following example.


IF Site Pages Visited (Page 3, Page 4, Page 8)
AND Clicks (Page 3: Link 8, Page 4: Link 7, Page 8: Link 2)
THEN Websell/Product #10 87%

In plain English, this rule translates as, "If a visitor has visited the following pages in your site, and clicked on particular links in these pages, then there is a high likelihood (87 percent) that he will be interested in Product #10."

Present Session On-Line Demographics
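A session-based rule of this kind can be checked as the clicks arrive. The Python sketch below is one hypothetical way to do it: a session is treated as a list of (page, link) events, and the product link is suggested only once every IF/AND condition has been met. The data layout and function names are our assumptions.

```python
# The rule above, expressed as data; the layout is hypothetical.
RULE = {
    "pages": {"Page 3", "Page 4", "Page 8"},
    "clicks": {("Page 3", "Link 8"), ("Page 4", "Link 7"), ("Page 8", "Link 2")},
    "suggest": "Websell/Product #10",
}

def conditions_met(session, rule):
    """Fraction of the rule's IF/AND conditions the session has satisfied so far."""
    pages_seen = {page for page, _ in session}
    clicks_seen = set(session)
    met = len(rule["pages"] & pages_seen) + len(rule["clicks"] & clicks_seen)
    return met / (len(rule["pages"]) + len(rule["clicks"]))

def suggestion(session, rule):
    """Return the product link once every condition holds, else None."""
    return rule["suggest"] if conditions_met(session, rule) == 1.0 else None

# A session is an ordered list of (page, link clicked) events.
session = [("Page 3", "Link 8"), ("Page 4", "Link 7")]
print(suggestion(session, RULE))   # None: Page 8 has not been visited yet
session.append(("Page 8", "Link 2"))
print(suggestion(session, RULE))   # Websell/Product #10
```
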

Note that the above represents the information a relational database would need in order to build a virtual store with Product #10 in it. All of this activity of visiting other pages and links in the site could take place within a matter of minutes before the link to Product #10 appears. The more activity on the site, the more "If/And" on-line conditions are available and the better the definition of the surfer. In this sense, pages built later in the visit come closer and closer to the surfer's needs and, theoretically, should have higher sales ratios.

The above does not preclude the mixing of "off-line" demographics with "on-line" demographics. Rather, it simply brings forward that demographics need to be created from both on-line activity and off-line activity. But online activity history ultimately is far greater than just the patterns within a particular site visit. It might be a person's overall click history on the internet (or at least whatever is capable of being captured and stored/archived for retrieval).

We argue that just as zip codes define where people physically live for the purpose of market segmentation in systems such as CACI's Acorn, web surfers have historical click patterns that also give them a type of internet "zip code" and define them by the pattern of "places" (sites) they have visited and the frequency of these visits. CACI's Acorn system segments American zip codes into approximately 43 patterns which are linked to types of products. The same can be done with the link patterns of internet users to develop a number of segments. This suggests three categories of marketing information which might be "mined" from web surfers: 1) off-line traditional demographic information, 2) current site activity information and 3) overall web history information.


Demographic Information (Past Off-Line Activity)
IF zip code (94121-94123)
AND age (45-49)
AND gender Male
----- Merge With ------
Site Specific Information (Current On-Line Activity)
IF Site Pages Visited (Page 3, Page 4, Page 8)
AND Clicks (Page 3: Link 8, Page 4: Link 7, Page 8: Link 2)
----- Merge With ------
Web Click History Information (Past On-Line Activity)
IF Overall Site Visit History = Pattern #9
---------Then ---------
Websell/Product #20

Off-Line Demographics + Present Session & Past History On-Line Demographics
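The merge of the three information sources can also be sketched in Python. Everything here is a hypothetical stand-in: the field names, the way a click history is reduced to a "pattern" number, and the requirement that all three blocks hold before Product #20 is offered are our assumptions for illustration.

```python
# Three hypothetical records about the same visitor, one per information source.
offline = {"zip": 94122, "age": 47, "gender": "M"}   # past off-line activity
session = {"clicks": {("Page 3", "Link 8"), ("Page 4", "Link 7"), ("Page 8", "Link 2")}}
history = {"pattern": 9}                             # past on-line activity

def merge_profile(*sources):
    """Combine the separate records into one visitor profile."""
    profile = {}
    for source in sources:
        profile.update(source)
    return profile

# Clicking a link on a page implies the page was visited, so the clicks
# alone cover the "Site Pages Visited" condition of the rule.
RULE_CLICKS = {("Page 3", "Link 8"), ("Page 4", "Link 7"), ("Page 8", "Link 2")}

def select_product(p):
    """Apply the merged rule: every block must hold before Product #20 is offered."""
    demographic = (
        94121 <= p.get("zip", 0) <= 94123
        and 45 <= p.get("age", 0) <= 49
        and p.get("gender") == "M"
    )
    site = RULE_CLICKS <= p.get("clicks", set())
    web = p.get("pattern") == 9
    return "Websell/Product #20" if demographic and site and web else None

print(select_product(merge_profile(offline, session, history)))  # Websell/Product #20
print(select_product(merge_profile(offline, session)))           # None: no web history
```
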

What the third example above does is match demographic history, web history and site activity to create true virtual stores targeted to who the particular surfer is. In effect, when a Pattern #9 surfer comes to a particular site, the site will know that the surfer is either some "wild cowboy" or a "little old lady" who is likely to drive a Nash Rambler. Once the site knows this information up front when the visitor enters the site, it can start matching this with the activity of the surfer on the site.

These same types of data-mining concepts are being used to pan for gold (and find new patterns of relationships) within large corporate databases. However, nowhere except on the web is there a possibility to surround customers immediately with the types of products they have said they need through their activity (rather than through outmoded focus groups and questionnaires).

With the matching of on-line and off-line activity data, there will exist the possibility of web sites becoming almost an extension of the surfer. One of the areas most likely to be affected is banner advertising. Why do you need ads directing you to other locations when the location has in effect already come to you via increasing customization? Why not immediately sell the product when you have a captive audience rather than send them off to another site? The theory and promise can be visualized today. But the reality will be in the future. Whether we want all of this to come remains to be seen, and the question will most likely be battled out in court over constitutional issues. One thing is sure, though. It all is a definite possibility and in fact close to the promise of the internet in interactivity. Yes, it's a little way in the future, but this future has a tendency to arrive before we have dinner set.

Is that the site doorbell?

Probably another Pattern #9 trying to pick up some old Beatles memorabilia.

Might as well ask him to stay for dinner.

© 1998 - John Fraim

John Fraim is President of GreatHouse Company, a research, consulting and publishing firm centered around the symbolism of popular culture. His articles have been published in a number of leading publications. His book Spirit Catcher won the 1997 Small Press Award for best biography. Visit the CyberBeacon Café at

Other Articles at The Jung Page:

The Symbolism of UFOs and Aliens (11/98)
The Subliminal Persuasion of Contact (9/97)
Visionary Rumors And The Symbolism of The Psychoanalytic Movement (6/95)