How does content uniqueness affect a site? Why is unique content important?

Get free lessons and tips on internet marketing

07.11.2019

Text uniqueness of at least 95%: virtually every webmaster makes this demand of copywriters. For the past two years, unique content has been the most discussed topic in the SEO community.

Getting caught by a filter, being banned, a drop in traffic – webmasters attribute any misfortune that befalls a site to the use of non-unique content. Is that really so? Let's figure out what webmasters are so afraid of, and whether the fear is justified.

What is text uniqueness and how to check it

When people talk about the uniqueness of a website’s content, they most often mean text. To understand what uniqueness is and how it is checked, let’s get acquainted with the term shingle.

A shingle is a fragment of text, a sequence of words (not necessarily a whole sentence), that programs use to check uniqueness.

Unique text is a set of shingles that are not found in the text of other documents on the network. For effective verification, shingles of 5 words are used.
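The splitting step can be sketched in a few lines of Python (a minimal illustration of the shingle idea, not any particular checker's actual algorithm):

```python
import re

def shingles(text, size=5):
    """Split text into overlapping word sequences ('shingles')."""
    # Normalize case and strip punctuation, since uniqueness
    # checkers generally ignore both.
    words = re.findall(r"\w+", text.lower())
    return [tuple(words[i:i + size]) for i in range(len(words) - size + 1)]

# A 10-word sentence with a shingle length of 5 yields 6 shingles.
print(len(shingles("Unique text is a set of shingles not found elsewhere")))
```

Because the windows overlap, a text of N words yields N − 4 five-word shingles, so even a short borrowed passage produces several matching shingles.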

How does the verification take place?

At the first stage, the program breaks the text into shingles and checks each of them for matches in the network.

Naturally, it finds many matches, because there are millions of documents on the Internet. In this text, for example, 75% of the shingles have already been used by someone else. But that alone does not mean the text is plagiarized.

At the second stage, the program compares groups of shingles of the text being checked with shingles of the text of documents on the network. If a text has at least 10 identical shingles, then it comes under suspicion.

The suspected text then undergoes a thorough check: sentences are compared, word order within them is examined, and synonyms are detected.

All borrowed fragments of the text are collected into a separate group. The program calculates their share of the entire text, subtracts it from 100%, and displays the result.
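Taken together, the steps above amount to the following sketch (a toy model of the described procedure; real checkers query a search index rather than a fixed list of documents):

```python
import re

def shingles(text, size=5):
    """Split text into overlapping word sequences ('shingles')."""
    words = re.findall(r"\w+", text.lower())
    return [tuple(words[i:i + size]) for i in range(len(words) - size + 1)]

def uniqueness(text, corpus, size=5):
    """Percentage of the text's shingles not found in any corpus document."""
    own = shingles(text, size)
    if not own:
        return 100.0  # too short to form a single shingle
    seen = set()
    for doc in corpus:
        seen.update(shingles(doc, size))
    borrowed = sum(1 for s in own if s in seen)
    return round(100.0 * (1 - borrowed / len(own)), 1)
```

A fully copy-pasted text scores 0%, an entirely fresh one 100%; everything in between is the "uniqueness percentage" that checkers report.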

What are the risks for a site when using non-unique content?

A high percentage of non-unique content on the site leads to dire consequences.

Search engines impose filters and penalties on websites. For copy-paste or a high percentage of non-unique content on a resource, you can get the following filters:

  • AGS from Yandex. Only the home page remains in the search results. As a rule, this filter punishes only sites with blatant copy-paste or crude synonymization.
  • The "You're the last one" filter. The search engine applies it to one specific page with a low uniqueness percentage and lowers that page in the search results.

Often a copy-pasted page on a long-established resource is not only indexed easily and shown in the results, but even reaches the TOP. On new sites, however, content with uniqueness below 80% ranks poorly, and a page with non-unique text cannot be pushed into the TOP at all. That is why webmasters of new sites are so scrupulous about selecting content.

What else should webmasters of new sites be wary of?

The search engine can simply ignore a site's non-unique content and not add its pages to the search results.

It happens like this: the robot visits your site for the first time to get acquainted with its content. It checks several pages, finds that the text on them is not unique, concludes that the entire site is like that, and simply leaves.

Therefore, until the robot indexes the site, add only unique content.

The importance of unique content for a site

Search engines have begun to pay more attention to the uniqueness of content. Five years ago, an optimizer could use links to promote a page even without text content. Today, optimized text is the most important tool for promotion.

Is it worth achieving maximum uniqueness of the text? Does it have a significant impact on promotion?

Unique text:

  • does not bring the page closer to the TOP of search results (for this, the text must be optimized);
  • does not simplify the work of external promotion;
  • does not improve behavioral factors. People cannot tell the difference between copy-paste and an original article.

Unique content on a site means trust from the search engines: confidence that they will index all of its pages and will not impose the AGS or "You're the last one" filters.

Non-unique content on the site means there is a high probability that the project will die without even being born.

You still sometimes read on forums that content uniqueness is of secondary importance, and then receive letters asking to figure out why a site tanked, with a note that the content is "unique AF, by 60%". In fact, the importance of unique content should not be underestimated. Where previously a non-unique article could immediately earn an AGS from Yandex, lately I have increasingly seen situations where unoriginal articles simply drag a site down significantly without it losing pages from the index.

Let me describe a situation of ours. There is a client's website (the domain is 6 years old) promoted for exactly 200 queries (mid-, low- and high-frequency, everything is there), spread fairly evenly across about 15 pages. When the site came to us, we checked all of its pages for uniqueness (not just the promoted ones), rewrote all the non-original texts and began promotion. Within the contracted period the site reached the top for 185 of the 200 queries (we had guaranteed 140) and stayed there for three months, until a sharp collapse about a month ago.

We analyzed everything – texts, internal optimization, links – and everything was fine except content uniqueness: it turned out our articles had already been stolen all over the RuNet, and the average uniqueness of the site's content had dropped to 63%. On half of the site's pages the content was less than 30% unique. We decided not to touch internal optimization or links, and to experiment only with the texts.

We rewrote the non-unique content, keeping the old keyword "nausea" (density) and occurrence figures and leaving internal linking, external links and anchors untouched. The result arrived in the very next update, as soon as the new texts entered the Yandex index.

I repeat: apart from replacing the non-unique texts, we performed no other actions on the site. This proves that the site was demoted in the results precisely because of the drop in its uniqueness. As the position chart showed, once the non-unique texts were replaced, the site returned to where it had been.

There are 2 points that I consider important in the described situation and to which I want to draw your attention:

1) After the site dropped, we analyzed all of its pages for uniqueness and rewrote the texts not only on the promoted pages but also on those not involved in promotion. I am fully confident that a site's ranking is influenced not just by the content quality of the promoted pages but by the content quality of the site as a whole. So if you have a couple of hundred non-unique pages and rewrite only the 10 promoted ones, it will not help.

2) When we checked the content for uniqueness, we noticed that Yandex identified all of our pages with non-unique content as the original source. Conclusion: a site can lose positions because of non-unique content even when it is recognized as the original source.

Unfortunately, there is no way to protect your content from theft 100%. You can reduce the number of copy-pasters by disabling text selection and copying in your page code. You can call plagiarists and ask them to remove the text (sometimes it helps). You can add a Copyscape.com badge to scare off "live" copy-pasters, or show a pop-up warning when someone tries to copy text. All this will reduce the number of duplicates of your texts on the RuNet, but it will not get rid of them for good: some texts you will have to re-uniquify yourself, again and again.

The importance of content is difficult to overestimate. If a site contains a large amount of quality content, it is destined for success. Everything would be fine, if not for one thing: the content must be your own. Your own: unique, interesting and useful. And it is on the word "unique" that 90% of those who have discovered the beauty of content stumble.

And really, why invent anything when everything has already been invented? The Internet is big and everything is already written there, so why reinvent the wheel? Take a piece of text from Wikipedia, a handful of paragraphs from a competitor's website, a pinch of beautiful phrases from the sites at the top of the results, and garnish it with photographs from Google Images. Links to the sources? Never heard of them. Done, the article is ready. Welcome to the world of modern copywriting!

Unique content

Unique content is the foundation of the Internet. (There is also communication, but that topic is beyond our scope here.) Besides the opportunity to communicate, users go online to access information that interests them. Finding content is exactly what search engines exist for, and it is what they value above all else. Yandex openly says that content is the main thing for it; Google's version is "Content is King". Accordingly, search engines value most those who regularly deliver unique and relevant content.

The question of what counts as unique content has long excited the imagination both of people who want to protect their intellectual property and of those who want to profit from the work of others. I don't want to start a controversy, but creating something from scratch is almost impossible. To create something, you have to create it from something: anything new appears on the basis of something that already exists, so declaring "I created this!" is strange, to say the least.

However, this does not mean at all that labor, time and effort should go unrewarded, much less that they may be freely appropriated. The question, then, is not so much about protecting rights or even making other people's content harder to use, as about speeding up and simplifying indexing, that is, about getting your content recognized as the original source.

Non-unique texts

Content is assessed by its potential applicability and the usefulness it can bring. But when we talk about content from the point of view of search engines, UNIQUENESS is added to that potential applicability. Which begs the question: who determines the uniqueness of content? After all, uniqueness is a comparative concept.

So who compares what, how, and with what? Search engines do the comparing: they compare new content with content already in their index. Roughly speaking, whoever's text was indexed first is whoever's text counts as original. I repeat, roughly speaking: the original source of content is considered to be the resource on which that content was first discovered. Roughly, because different types of content are analyzed in different ways when determining the primary source. It can also be assumed that the attributed original source may change as data accumulates about the content, its sources, and the state of those sources.
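The "first indexed wins" rule can be pictured with a toy registry (a hypothetical sketch; real engines fingerprint and attribute content far more elaborately):

```python
import hashlib

class SourceRegistry:
    """Toy model of first-indexed-wins source attribution."""
    def __init__(self):
        self.first_seen = {}  # content fingerprint -> original URL

    @staticmethod
    def fingerprint(text):
        # Normalize whitespace and case before hashing, so trivial
        # variations of the same text map to one fingerprint.
        norm = " ".join(text.lower().split())
        return hashlib.sha256(norm.encode()).hexdigest()

    def record(self, url, text):
        """Register a crawled page; return the URL attributed as the source."""
        fp = self.fingerprint(text)
        self.first_seen.setdefault(fp, url)
        return self.first_seen[fp]
```

Whichever URL the crawler sees first keeps the attribution, regardless of who actually wrote the text, which is exactly the risk the following sections describe.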

Non-unique pictures

Let's take images as an example. Today a search engine finds a new picture at 640x480 resolution on one site, and tomorrow the same picture at a higher 800x600 resolution on another. Which is the original source? That depends on the circumstances and, in fact, on the search engine itself that found the pictures.

Content on the Internet is freely available, and users can do whatever they want with it; that is the reality. Of course, someone can claim that a picture is theirs and begin proceedings over the unlawful use of copyrighted material. But the technical ability to use the content will not go away.

Therefore, no one can be sure that 100% of the content they create will be recognized as 100% theirs. And the © icon won't help.

Stolen content

Texts are stolen. Photos, pictures and every other kind of image are stolen. Videos are stolen. Music is stolen. People also steal oil, gas, timber, people, seals, love, freedom and independence. Everyone steals. You need to understand this, accept it, and think about how to resist it, especially since others have already done that thinking for you. Why not take advantage? 🙂

I won't list all the possible ways of combating content theft (if you really want them, write in the comments and we can do a separate article). I'll just try to explain the general principles of publishing and basic protection of content on the Internet.

Basic principles

The first and most important principle is maximum uniqueness of the content. Yes, the alphabet has a limited number of letters, and there are only three primary colors (fine, plus black and white). But every text has a unique logical structure, and when a text is written by a person, its logic and manner of writing become a unique fingerprint. And it is impossible to take two absolutely identical photographs.

Conclusion: when you create content yourself, the likelihood of significant matches approaches zero.

The second important principle is indexing speed. The faster a search engine finds and indexes content, the faster its source is identified. Suppose you blog actively, but for one reason or another search engines index your site poorly. Someone whose site is indexed better (faster) starts stealing your content with banal copy-paste and posting it on their own site. If your content is indexed faster on someone else's site, then as far as the search engines are concerned it is not your content: the primary source will be the site where the article was first found. It turns out you stole your own article.

Conclusion: High indexing speed is your best friend.
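One common way to speed up discovery is to publish a sitemap and "ping" the search engine when it changes. A minimal sketch of building such a ping request (the endpoint shown is an assumption for illustration, and some engines have since retired their ping URLs; check your engine's current documentation):

```python
from urllib.parse import urlencode

def sitemap_ping_url(endpoint, sitemap_url):
    """Build the GET URL that asks a search engine to re-fetch a sitemap."""
    return endpoint + "?" + urlencode({"sitemap": sitemap_url})

# Hypothetical endpoint; verify against your search engine's docs.
ping = sitemap_ping_url("https://www.google.com/ping",
                        "https://example.com/sitemap.xml")
```

Fetching the resulting URL (with any HTTP client) tells the engine to re-crawl the sitemap, shortening the window in which a thief's copy could be indexed first.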

Yandex.Webmaster - Original texts

This is a service with which you can notify Yandex about the appearance of original text on the site.

Quote: If you publish original texts on your website and they get reprinted by other Internet resources, notify Yandex about the upcoming publication of the text. We will know that the original text first appeared on your site and will try to take this into account when tuning our search algorithms.

There are many ways to hinder misuse of your content, but for each of them there are several workarounds. If someone is known to systematically steal your content, you can request that it be removed from the third-party site or initiate legal proceedings. Practice shows, however, that if a resource does not remove the content voluntarily, pursuing it through the courts may cost more than the damage from the theft itself.

Post your own unique content. Think about how, when and where to post content. And you will be happy :)


    Sergey Burykh

    Hello!
    Example: I am a startupper. I opened an online store, and at first I had no physical capacity to post unique content, so of course I did a lot of copy-paste. But suppose things went uphill, I got the resources and decided to improve the situation, and began systematically, at the very least, rewriting the texts. Will search engines index them?
    And another question: what percentage of uniqueness can be considered acceptable? Is 70% enough, for example, or is 99% the minimum?
    Thank you)

    1. Anton Soshnikov

      1. “...I do not have the physical ability to post unique content.” If a site has nothing to offer a search engine, then it has no business being in the search results.
      2. “...began to systematically, well, at least rewrite the texts. Will search engines index them?” Search engines will index them in any case.
      3. “...what percentage of uniqueness...” A percentage relative to what? Do you know exactly how search engines determine the uniqueness of text on a website, and how that is expressed as a percentage?
      4. “...can be considered suitable? for example, 70% is enough...” “Suitable” and “sufficient” for what?

      Judging by the example you gave, you are trying to plug a hole in the site under the fashionable name of SEO without fully understanding what it is or why. SEO is a large complex of interrelated factors, and rewriting texts alone will not make the difference.

      1. Sergey Burykh

        I understand that the SEO hole is large and that many factors influence its size; work on them is underway. Let's take a hypothetical situation in which all the technical points are more or less resolved, but the content remains the same, that is, non-unique. So I take a text, rewrite it, check it with Advego Plagiarism or some other tool, and the program tells me it is 70% unique, highlighting the passages that already exist on other sites. And here I just wanted to ask the professionals: how do search engines determine the uniqueness of text on a site, and how is it expressed as a percentage? I understand these algorithms are probably not public, but I want to grasp at least the principle. And by "suitable" I mean that search engines perceive the text as unique. It's a bit muddled, but I hope it's at least somewhat clear :)

        1. Anton Soshnikov

          There is no single value that could characterize the originality of a specific text through the eyes of a search engine, so there is no point in fixating on one. In other words, we should not obsess over the originality of content if we ourselves are its source. A rewrite of a text, as a free retelling, can be considered original; knowing the percentage match with the source is useful because it lets you move further away from it. But even here things are not simple: search engines like Google or Yandex recognize synonymization perfectly well and read the logic of a text. Ideally, when rewriting, the logical structures of the text should change, and instead of synonyms it is better to use logical synonyms (you can always describe the same thing in different words and with different emphasis). Google wrote somewhere that what matters to it is not so much the uniqueness of the text as the unique opinion a person expresses through it. That is why I wrote above about a FREE retelling of the source, that is, personalized rewriting. For an online store the task is harder, since text descriptions are usually short and there is little room for originality in them. But the essence of the approach does not change: make original, unusual and more informative descriptions, and you will definitely see them rank better than competitors' boilerplate.

  • This post will not discuss how the uniqueness of a specific promoted page's content matters; it is about how the uniqueness of the content of the site as a whole affects the positions of specific promoted pages in the Yandex results.

    It goes without saying that a promoted article must be unique and optimized for its key queries; but, as noted above, the subject here is the site-wide effect of content uniqueness on the positions of promoted pages.

    Below I describe the story of one of my sites, where I felt first-hand how non-unique content on a site affects the positions of its promoted, unique pages.

    There is a website. As usual, I promote some of its pages in the search engines for specific queries. The promoted pages have unique content and are optimized for those queries. They have links from Sape and from free sources.

    For a long time, almost all the promoted pages sat in the Yandex top 10 for most of the queries that interested me. At some point I stopped tracking their positions: the pages stayed consistently in the top 10 for the relevant queries, and that suited me fine. I bought no new links and changed nothing.

    I did not track position changes systematically; I only checked queries manually now and then. Positions fluctuated, but not significantly. Then at some point I noticed that many queries had slipped very noticeably to the 2nd–4th pages of the Yandex results. I could not understand why: the results were full of sites with no keywords in the title, unoptimized content and hardly any incoming links, yet they ranked above my pages.

    I could not work out the reason for such a drop. One idea was that it was related to the Sape links; I thought most of them had simply stopped working. I installed a plugin for checking Sape links, removed the bad ones and bought more, but the positions stayed put, and some continued to slide.

    But there was one more point that I forgot about.

    The site in question is aimed at students. At some point I installed on it a catalog of ready-made student papers from a partner store, described in an earlier post. Naturally, the paper descriptions were not unique. Thus, on a site of only about 100 pages, more than 10,000 pages with non-unique content appeared. Yandex indexed them, and the site ended up with over 10,000 pages in the Yandex index. At the time I was only pleased, and even thought about putting Sape on the site and selling links from the catalog.

    But I never did install Sape: reason prevailed, and I was afraid the search engines would take it badly and the positions would fall even lower. While I deliberated, the catalog stayed up, and the number of non-unique pages in the index only grew. At one point the site had almost 20,000 pages in the Yandex index.

    Eventually I guessed that the drop in Yandex positions was connected with this catalog, simply because of the mass of non-unique pages it had added to the site. Since I had finally decided not to put Sape on the site, I deleted the catalog.

    After I removed the catalog and its pages began to fall out of the Yandex index, I noticed the positions returning to the 1st–2nd pages of the results. Almost all of them came back to their previous places, and some rose even higher, and this while not all of the catalog pages have dropped out of the index yet. I expect that when they all do, the positions will rise further still.

    From this story I concluded that the uniqueness of a site's content as a whole affects the positions of specific promoted pages, even when those pages themselves have unique, optimized content. If a site carries a lot of non-unique content, its pages will be harder to promote, however well optimized and unique they are.

    After that, I resolved never again to add such catalogs to sites whose pages are being promoted in the search engines, or, if I do add them, to make sure to close them from indexing.
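    A directory like that can be closed from indexing with a robots.txt rule. A minimal sketch, assuming the catalog lives under a /catalog/ path (the path is hypothetical; substitute your own):

    ```
    User-agent: *
    Disallow: /catalog/
    ```

    For individual pages, the robots noindex meta tag serves the same purpose.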

    P.S. An earlier post mentions a program, Etxt Anti-plagiarism, which lets you check all the texts on a site for uniqueness.

    Another post, which presents several free programs that help you select and analyze keywords when choosing queries, may also be useful.

    To users, unique content signals the value of the information, products or services on offer, and it improves the overall perception of the brand. It is a broad concept that covers not only the quality of the posted content but also the site's overall corporate style and the exclusivity of the service or product provided.

    Aspects of unique presentation of material

    The concept of uniqueness is used widely on the Internet and refers mainly to the quality of texts. Texts should be exclusive, written to order, and the phrases repeated within individual blocks of text should not coincide with other search results.

    Aspects of creating unique content include:

    1. Creating uniqueness for search engine robots.

    Working with low-frequency queries is well covered in a separate material.

    Afterword

    The uniqueness of content should not rest on mere technical uniqueness as measured by search engine mechanisms: content should be grounded in the actual needs of site visitors and built around their real demand. Study the audience, analyze behavioral patterns, and regularly try new ways of presenting content: an attentive attitude toward your audience will earn its attention to your resource in return.

    Sincerely, Nastya Chekhova