Not Just Google: Regulators Shut Down Bing, Yahoo Mortgage Scam Ads


The fake mortgage programs that brought Google under recent scrutiny don’t seem to be run by Google-exclusive con artists. Mortgage scams that claim to provide assistance from the U.S. government have also been located in Microsoft adCenter. Microsoft has cooperated in banning the associated advertisers from Bing and Yahoo search results.

According to the statement from SIGTARP (the Special Inspector General for the Troubled Asset Relief Program), at least 125 illicit companies of this nature were identified. The investigation, however, remains ongoing.

Microsoft, meanwhile, has extended its cooperation in much the same way Google did earlier on in the investigation. Microsoft proactively shut down the accounts of more than 400 advertisers that were connected to the 125 mortgage scam companies.

“Many homeowners who have fallen prey to these scams were enticed by web banner ads and online search advertisements that promised, for a fee, to help lower mortgage payments,” according to Christy Romero, Deputy Special Inspector General for SIGTARP.

In addition to working with Google and Microsoft to shut down these ads (which Romero states will “dramatically decrease the scope and scale of these scams”), SIGTARP hopes to educate home owners.

According to the official SIGTARP statement, “Homeowners can protect themselves from becoming a victim of these scams by seeking a HAMP mortgage modification directly through their lender or mortgage servicer or through HUD-approved housing counselors.” The HAMP program is free of charge and approved by the U.S. government.

Having Microsoft fall to the same issue re-raises the question of ad platform responsibility. Should Microsoft have spotted these ads prior to posting them on Bing and Yahoo? Is it likely they ignored them for profit alone? Or is it simply too much to expect a company to vet its potential ads so thoroughly? Leave your thoughts in the comments.


Volunia Search Engine: A ‘Radical New View’ of Search


Volunia, an Italy-based search engine, is preparing for its early beta stages. It promises to be a “radical new view” of search, and is being spearheaded by famed technology thinker Massimo Marchiori, PC World reported.

A New Search Concept from the Father of PageRank

While Larry Page and Sergey Brin certainly deserve credit for their effective implementation of the PageRank concept, the Google founders have had no issues in crediting Massimo Marchiori for the idea. It was Marchiori’s HyperSearch concept, presented at a 1996 conference, that sparked the idea for PageRank as we now think of it.

Since Marchiori has come across at least one revolutionary search concept in the past, it’s worth an eyebrow raise that he claims his latest project “[is] a different perspective. It’s a new radical view of what a search engine of the future could be.” While excited about the project, Marchiori has declined to release any further details. He states that Google “would have no difficulty in setting 100 engineers to work day and night on our idea and in coming out before us.”

What we do know is that Marchiori’s new search project is called “Volunia,” and will be (in Marchiori’s terms) a “fencer’s foil” to Google’s “club.” The Volunia site is currently seeking applicants for “power user” positions. These power users would be granted early beta access and direct lines of communication to developers.

Those interested in learning more about this highly veiled search can watch the videos posted on the Volunia site (such as the one embedded above) and check back in with the site for future updates. While the site is approaching beta, there is still no launch date or even an estimated launch year.

Mariano Pireddu, an entrepreneur with a focus in IT, is funding the project. Pireddu clarifies that only he and Marchiori are currently partners, and that “We have sufficient funds to finance the early stages of the venture. It’s premature to talk about the possible entry of new investors.” The company is working with a staff of developers and engineers, however, including engineers who were taught by Marchiori at the University of Padua.

Marchiori hasn’t sat still since the 1996 HyperSearch presentation. His work with the Platform for Privacy Preferences (P3P), the Web Ontology Language (OWL), and the World Wide Web Consortium (W3C) has continued to make a stir over the last decade.


Introducing SEOmoz’s Updated Page Authority and Domain Authority

Here at Moz, we take metrics and analytics seriously and work hard to ensure that our metrics are first rate. Among our most important link metrics are Page Authority and Domain Authority. Accordingly, we have been working to improve these so that they more accurately reflect a given page or domain’s ability to rank in search results. This blog entry provides an overview of these metrics and introduces our new Authority models with a deep technical dive.

What are Page and Domain Authority?

Page and Domain Authority are machine learning ranking models that predict the likelihood of a single page or domain to rank in search results, regardless of page content. Their input is the 41 link metrics available in our Linkscape URL Metrics API call and their output is a score on a scale from 1 to 100. They are keyword agnostic because they do not use any information about the page content.

Why are Page and Domain Authority being updated?

Since these models predict search engine position, it is important to update them periodically to capture changes in the search engines’ ranking algorithms. In addition, this update includes some changes to the underlying models resulting in increased accuracy. Our favorite measure of accuracy is the mean Spearman Correlation over a collection of SERPs. The next chart compares the correlations on several previous indices and the next index release (Index 47).

The new model outperforms the old model on the same data using the top 30 search results, and performs better if more results are used (top 50). Note that these are out-of-sample predictions.
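To make the accuracy measure concrete, here is a minimal sketch of a mean Spearman correlation over a collection of SERPs. It is not our production code: the function names and the two tiny SERPs are hypothetical, and it assumes each SERP is a list of (search position, score) pairs with at least two distinct positions and scores.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at sorted slot i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if x or y is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def mean_spearman(serps):
    """serps: list of SERPs, each a list of (position, score) pairs."""
    correlations = []
    for serp in serps:
        positions = [position for position, _ in serp]
        # Negate the scores so that "higher score at a better (smaller)
        # position" shows up as a positive correlation.
        negated_scores = [-score for _, score in serp]
        correlations.append(spearman(positions, negated_scores))
    return mean(correlations)

# Hypothetical data: two SERPs with four results each.
example_serps = [
    [(1, 90), (2, 80), (3, 70), (4, 60)],  # scores agree perfectly with the ordering
    [(1, 50), (2, 70), (3, 40), (4, 30)],  # the top two results are swapped
]
print(mean_spearman(example_serps))
```

A value near 1 means the score orders the results almost exactly as the search engine did, and a value near 0 means the score carries little information about the ordering.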

When will the models change? Will this affect my scores?

The models will be updated when we roll out the next Linkscape index update, sometime during the week of November 28. Your scores will likely change a little, and may potentially change by as many as 20 points or more. I’ll present some data later in this post that shows most PRO and Free Trial members with campaigns will see a slight increase in their Page Authority.

What does this mean if I use Page Authority and Domain Authority data?

First, the metrics will be better at predicting search position, and Page Authority will remain the single highest correlated metric with search position that we have seen (including mozRank and the other 100+ metrics we examined in our Search Engine Ranking Factors study). However, since we don’t yet have a good web spam scoring system, sites that manipulate search engines will slip by us (and look like outliers), so a human review is still wise.

Before presenting some details of the models, I’d like to illustrate what we mean by a “machine learning ranking model.” The table below shows the top 26 results for the keyword “pumpkin recipes” with a few of our Linkscape metrics (Google-US search engine; this is from an older data set and older index, but serves as a good illustration).

Pumpkin Recipes SERP result

As you can see, there is quite a spread among the different metrics illustrated, with some of the pages having a few links and others 1,000+ links. The Linking Root Domains also range from only 46 to 200,000+. The Page Authority model takes these link metrics as input (plus 36 other link metrics not shown) and predicts the SERP ordering. Since it only takes into account link metrics (and explicitly ignores any page or keyword content), but search engines take many ranking factors into consideration, the model cannot be 100% accurate. Indeed, in this SERP, the top result benefits from an exact domain match to the keyword, which helps explain its #1 position despite its relatively low link metrics. However, since Page Authority only takes link metrics as input, it is a single aggregate score that explains how likely a page is to rank in search based only on links. Domain Authority is similar for domain-wide ranking. The models are trained on a large collection of Google-US SERP results.

Despite restricting to only link metrics, the new Page and Domain Authority models do a good job of predicting SERP ordering and improve substantially over the existing models. This increased accuracy is due in part to the new model’s ability to better separate pages with moderate Page Authority values into higher and lower scores.

This chart shows the distribution of the Page Authority values for the new and old models over a data set generated from 10,000+ SERPs that includes 200,000+ unique pages (similar to the one used in our Search Engine Ranking Factors). As you can see, the new model has “fatter tails” and moves some of the pages with moderate scores to higher and lower values resulting in better discriminating power. The average Page Authority for both sets is about the same, but the new model has a higher standard deviation, consistent with a larger spread. In addition to the smaller SERP data set, this larger spread is also present in our entire 40+ billion page index (plotted with the logarithm of page/domain count to see the details in the tails):

One interesting comparison is the change in Page Authority for the domains, subdomains and sub-folders PRO and Free Trial members are tracking in our campaign based tools.

The top left panel in the chart shows that the new model shifts the distribution of Page Authority for the active domains, subdomains and sub-folders to the right. The distribution of the change in Page Authority is included in the top right panel, and shows that most of the campaigns have a small increase in their scores (average increase is 3.7), with some sites increasing by 20 points or more. A scatter plot of the individual campaign changes is illustrated in the bottom panel, and shows that 82% of the active domains, subdomains and sub-folders will see an increase in their Page Authority (these are the dots above the gray line). It should be noted that these comparisons are based solely on changes in the model, and any additional links that these campaigns have acquired since the last index update will act to increase the scores (and conversely, any links that have been dropped will act to decrease scores).

The remainder of this post provides more detail about these metrics. To sum up this first part, the models underlying the Page and Domain Authority metrics will be updated with the next Linkscape index update. This will improve their ability to predict search position, due in part to the new model’s better ability to separate pages based on their link profiles. Page Authority will remain the single highest correlated metric with search position that we have seen.


The rest of the post provides a deeper look at these models, and a lot of what follows is quite technical. Fortunately, none of this information is needed to actually use these Authority scores (just as understanding the details of Google’s search algorithm is not necessary to use it). However, if you are curious about some of the details then read on.

The previous discussion has centered around distributions of Page Authority across a set of pages. To gain a better understanding of the models’ characteristics, we need to explore their behavior on the inputs. However, the inputs are a 41-dimensional space and it’s impossible (for me at least!) to visualize anything in 41 dimensions. As an alternative, we can attempt to reduce the dimensionality to something more manageable. The intuition here is that pages that have a lot of links probably have a lot of external links, followed links, a high mozRank, etc. Domains that have a lot of linking root domains probably have a lot of linking IPs, linking subdomains, a high domain mozRank, etc. One approach we could take is simply to select a subset of metrics (like the table in the “pumpkin recipes” SERP above) and examine those. However, this throws away the information from the other metrics and will inherently be noisier than something that uses all of them. Principal Component Analysis (PCA) is an alternate approach that uses all of the data. Before diving into the PCA decomposition of the data, I’ll take a step back and explain what PCA is with an example.

Principal Component Analysis is a technique that reduces dimensionality by projecting the data onto Principal Components (PC) that explain most of the variability in the original data. This figure illustrates PCA on a small two dimensional data set:

This sample data looks roughly like an ellipse. PCA computes two principal components illustrated by the red lines and labeled in the graph that roughly align with the axes of the ellipse. One representation of the data is the familiar (x, y) coordinates. A second, equivalent representation is the projection of this data onto the principal components illustrated by the labeled points. Take the upper point (7.5, 6). Given these two values, it’s hard to determine where it is in the ellipse. However, if we project it onto the PCs we get (4.5, 1.2), which tells us that it is far to the right of the center along the main axis (the 4.5 value) and a little up along the second axis (the 1.2 value).
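For readers who prefer code to pictures, the two dimensional case can be sketched in a few lines of plain Python. This is an illustration only: the five sample points are hypothetical (they are not the data behind the figure), and the function names are made up for this example. It uses the closed-form angle of the major axis of the covariance ellipse, which only works in two dimensions.

```python
import math

def pca_2d(points):
    """Return (mean, pc1, pc2, var1, var2) for a list of (x, y) points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Entries of the sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    # Angle of the major axis of the covariance ellipse.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)
    pc1, pc2 = (c, s), (-s, c)  # unit vectors along the two principal axes
    # Variance of the data along each principal component.
    var1 = sxx * c * c + 2 * sxy * s * c + syy * s * s
    var2 = sxx * s * s - 2 * sxy * s * c + syy * c * c
    return (mx, my), pc1, pc2, var1, var2

def project(point, mean, pc1, pc2):
    """Express a point as coordinates along pc1 and pc2, measured from the mean."""
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return dx * pc1[0] + dy * pc1[1], dx * pc2[0] + dy * pc2[1]

# Hypothetical sample: points spread mostly along the diagonal y = x.
sample = [(0, 0), (2, 2), (4, 4), (1, 3), (3, 1)]
mean_xy, pc1, pc2, var1, var2 = pca_2d(sample)
along_main, along_second = project((4, 4), mean_xy, pc1, pc2)
explained_by_pc1 = var1 / (var1 + var2)
print(along_main, along_second, explained_by_pc1)
```

The projected coordinates play the same role as the (4.5, 1.2) pair above: the first number says how far along the main axis the point sits, and the second says how far off that axis it is. With more than two inputs the same idea applies, but the components are usually computed with a linear algebra library rather than a closed-form angle.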

We can do the same thing with the link metrics, only instead of using two inputs we use all 41 inputs. After doing so, something remarkable happens:

Two principal components naturally emerge that collectively explain 88% of the covariance in the original data! Put another way, almost all of the data lies in some sort of strange ellipse in our 41 dimensional space. Moreover, these PCs have a very natural link to our intuition. The first PC, which I’ll call the Domain/Subdomain PC projects strongly onto the domain and subdomain related metrics (upper panel, blue and red lines), and has a very small projection onto the page metric (upper panel green lines). The second PC has the opposite property and projects strongly onto page related metrics with a small projection onto Domain/Subdomain metrics.

Don’t worry if you didn’t follow all of that technical mumbo jumbo in the last few paragraphs. Here’s the key point: instead of talking about number of links, followed external links to domains, linking root domains, etc. we can instead talk about just two things – an aggregate domain/subdomain link metric and an aggregate page link metric and recover most of the information in the original 41 metrics.

Armed with this new knowledge, we can revisit the 10K SERP data and analyze it with these aggregate metrics.

This chart shows the joint distribution of the 10K SERP data projected onto these PCs, along with the marginal distribution of each on the top and right hand side. At the bottom left side of the chart are pages with low values for each PC, signifying that the page doesn’t have many links and is on a domain without many links. There aren’t many of these in the SERP data since these are unlikely to rank in search results. In the upper right are heavily linked-to pages on heavily linked-to domains, the most popular pages on the internet. Again, there aren’t many of these pages in the SERP data because there aren’t many of them on the internet. Interestingly, most of the SERP data falls into one of two distinct clusters. By examining the following figure we can identify these clusters:

This chart shows the average folder depth of each search result, where folder depth is defined as the number of slashes (/) after the home page (with 1 defined to be the home page). By comparing with the previous chart, we can identify the two distinct clusters as home pages and pages deep on heavily linked to domains.
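As a rough sketch, folder depth can be computed from a URL like this. The function names and example URLs are hypothetical, and the sketch assumes a literal reading of the definition above: count the slashes in the path of an absolute URL (one that includes the scheme), treating a missing path as the home page.

```python
from statistics import mean
from urllib.parse import urlsplit

def folder_depth(url):
    """Number of slashes in the URL path; the home page counts as 1."""
    # An absolute URL with no path (http://example.com) is still the home page.
    path = urlsplit(url).path or "/"
    return path.count("/")

def average_folder_depth(urls):
    """Average folder depth over a group of search results."""
    return mean(folder_depth(url) for url in urls)

# Hypothetical URLs for illustration.
print(folder_depth("http://example.com/"))                       # home page
print(folder_depth("http://example.com/recipes/pumpkin.html"))   # one folder down
print(average_folder_depth([
    "http://example.com/",
    "http://example.com/recipes/soups/pumpkin.html",
]))
```

Under this reading a page directly under the root (such as /about) has the same depth as the home page, so other counting conventions are possible.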

To circle back to search position, we can plot the average search position:

We see a general trend toward higher search position as the aggregate page and domain metrics increase. This data set only collected the top 30 results for each keyword, so values of average search position greater than 16 are in the bottom half of our data set. Finally, we can visually confirm that our Page and Domain Authority models capture this behavior and gain further insight into the new vs old model differences:

This is a dense figure, but here are the most important pieces. First, Page Authority captures the overall behavior seen in the Average Search position plot, with higher scores for pages that rank higher and lower scores for pages that rank lower (top left). Second, comparing the old vs new models, we see that the new model predicts higher scores for the most heavily linked to pages and lower scores for the least heavily linked to pages, consistent with our previous observation that the new model does a better job discriminating among pages.


How To Create SEO-Friendly Content

Getting your voice heard on the internet is never easy. It can be like setting up your soapbox on a crowded street, and waving frantically to get passers-by to pay attention. You might be an expert in your chosen topic, with pearls of wisdom to dispense on X particles or Z-list celebrities, but how do you get people to stop long enough to listen?

The answer is finding the right balance between SEO-friendly content and readability. It’s essential to make sure the Google (and Bing) spiders – and therefore readers – can find your website or blog. Knowing a few tricks can help you climb their rankings, without sacrificing your sparkling writing or specialist knowledge.

Choose Your Keywords

Keywords are the most important aspect of SEO, so think about them before you even start to write. It can be hard slotting keywords in afterwards without sounding clunky and forced.

Brainstorm words and phrases you think people are looking for, and use trusted tools such as Google AdWords to help pick the best. Consider how much competition there is for each phrase. Instead of catch-all terms such as “travel agency”, consider more specific terms, such as “Italian luxury travel specialist”, to sell your particular area of expertise.

Place Keywords Carefully

Search engines don’t just analyze which words you use, but where you place them. Getting keywords in the title or first sentence is obviously a good start. Many newspapers change their pun-heavy headlines to more SEO-friendly versions on their websites.

You need to find the right keyword density – “keyword stuffing” can be penalized by search engines, as well as being a turn-off to readers!
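If you want a rough number for keyword density, a small script can count it for you. The sketch below is only an illustration: the function name and sample text are made up, and it assumes one common definition of density (words belonging to the phrase divided by total words). It does not say what density a search engine considers too high.

```python
import re

def keyword_density(text, phrase):
    """Share of the words in text that belong to occurrences of phrase."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    target = re.findall(r"[a-z0-9']+", phrase.lower())
    if not words or not target:
        return 0.0
    n = len(target)
    # Count every position where the phrase appears as a run of whole words.
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == target)
    return hits * n / len(words)

# Hypothetical sample text.
sample_text = "Pumpkin recipes are easy. Try these pumpkin recipes today."
print(keyword_density(sample_text, "pumpkin recipes"))
```

Here the phrase appears twice in nine words, so four of the nine words belong to the keyword. Reading the text aloud is still the best test of whether it sounds stuffed.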

Don’t forget “hidden places” to put your keywords, such as meta tags and image captions.

Use Free Tools

Take advantage of free tools, such as Google Analytics, which can assess where your site traffic comes from and which aspects need more work. For bloggers, WordPress has various plug-ins that can help you choose the best post title and keywords, avoid duplicate content and make the most of meta-tags. Mashable has a list of the top 20 WordPress SEO plug-ins.

Become An Expert on Your Topic

Let’s say users are searching for “Edinburgh travel tips” or “easy Christmas recipes”. If you’ve got several articles on the same topic, then search engines will assume you know what you’re talking about. Choosing a targeted area of expertise will help you get on that coveted first page of search results.

Use Links Wisely

It’s not just what you write that counts. Clever use of links will help your site climb the rankings – including ones to other parts of your own website. If another page has relevant information or you’ve written a similar post in the past then add a link. Just don’t overdo it!

Wear the Right Coloured Hat

SEO techniques are sometimes referred to as “white hat” or “black hat”. Search engines regard “white hat” techniques as legitimate ways to optimize your website and help users find the information they want. “Black hat” techniques refer to practices such as using hidden text, or having separate versions of websites to deceive search engines. They might work in the short term, but could lead to Google blocking your site – not a good strategy!

Write for Your Audience

“Content is king” may be a cliché, but it’s basically true. SEO techniques can grab readers, but engaging writing keeps them there. Don’t let your text get so loaded down with keywords that your main points get lost. Giving away useful information or creating a lively, informative blog is the best way to keep readers coming back to your site. Think of your audience. Are they interested enough to plough through a long piece of text? Or do they just want the basic facts as quickly and succinctly as possible?

Make it Readable

And on the same theme, make sure your writing is easy on the eye. Break up chunks of text into subcategories, and use images effectively. Lists can be a good idea – and a way to repeat keywords without readers noticing! Use short sentences and leave plenty of white space.


Bing And Yahoo Advertisers Get New Tools

Microsoft is talking about some new features it has for adCenter.

“Over the last two weeks, adCenter has released its latest round of pre-holiday features, all delivering on advertisers’ wish lists of improving campaign performance, increasing volume, and simplifying processes to help save time,” a spokesperson tells WebProNews.

Features include a redesigned web user interface, an upgrade to the adCenter Desktop, and the release of several performance reporting tools.

Microsoft outlines each of these.

The interface:

  • Simplified Campaign Set Up for creating campaigns and ad groups, and a sleek, new single-page view with real-time previews and keyword suggestion, enabling quicker campaign deployment.
  • Improvements to Navigation & Discovery to help advertisers manage across their entire account by viewing and editing keywords and ads across multiple campaigns and ad-groups at once.
  • Improvements to Campaign Reporting with new multi-metric trend charts, delivery status notification features and positional bid estimates.
  • Improvements to Editing with in-line editing, in-line bid editing, and best position estimations in the keywords grid.

The desktop:

  • New Welcome Screen takes advertisers on an end-to-end tour of the Desktop tool to help them get set up and started quickly.
  • An expanded Import Campaigns feature to allow advertisers to easily and directly import their Google AdWords campaign data into the Desktop.
  • Clipboard support to enable basic copy and paste functionality so that advertisers can quickly and easily copy data and move it to, from, and within Desktop.
  • Bulk bid suggestions to offer more than 1,000 keywords and let advertisers easily apply changes in order to increase traffic.
  • Simplified Targeting with a default set to the advertisers’ account location, determined by the language listed in their Desktop settings.


The reporting tools:

  • New, improved Opportunities Tab that includes bid suggestions for exact/broad match and in-line editing. With this new feature advertisers can easily address underperforming bids to target more volume.
  • New Share of Voice feature that quantifies missed impressions in Account, Campaign, and Ad Group performance reports, and helps prioritize optimizations more effectively.
  • Improved historical and aggregated Quality Score data to allow advertisers to view aggregated quality score by summary or by time frame, including hour, day, week or month.
  • An upgrade to Change History reports so advertisers can view targeting changes and gain better insights into campaign performance related to those changes.

For those of you who aren’t advertising with adCenter, remember that these things apply to Bing and Yahoo advertisers.


89% of U.S. Companies Will Use Social Networks For Recruiting

Jobvite released an interesting report looking at employers’ use of social media in the recruiting process. The biggest takeaway, as illustrated above, is that 89% of U.S. companies will use social networks for recruiting.

That’s a lot. However, it also means that 11% aren’t even bothering to check out LinkedIn, which I find pretty interesting, given how much professional information there is on the network.

Some other key findings:

  • 55% will increase their budgets for social recruiting; referrals, corporate career sites and direct sourcing are other top categories for increased investment.
  • Referrals, direct sourcing and social networks are the top rated external sources for quality candidates.
  • Only 16% will spend more on job boards and a third of respondents plan to spend less on job boards, third party recruiters and search firms.
  • LinkedIn has led in recruiting usage each year and now almost all of those surveyed (87%) use the professional network, up from 78% last year.
  • Recruiting usage of other major networks stayed fairly steady with 55% using Facebook and 47% using Twitter.
  • But now, most (64%) have expanded their social recruiting programs to two or more social media channels; and 40% use all three top networks – LinkedIn, Facebook and Twitter.
  • 77% of survey respondents expect increased competition for talent.
  • Nearly 2/3 of companies intend to recruit from competitors in the year ahead.
  • Among companies anticipating increased hiring this year, 95% now use or plan to start using social recruiting.