
The Beginner’s Guide To Semantic Tagging

Semantic tagging, or semantic markup, is HTML that gives a web page meaning as well as presentation. Why is it important? It makes information easier to communicate and easier to find. When you add semantic tags to a blog post or document, you are providing more information about its content. For example, the <h1> tag, which usually holds the title of a blog post, indicates that the enclosed text is a level 1 heading. This is semantic as well as presentational, because both the user and the browser know that it is a heading. However, in HTML4, tags such as bold <b> and italic <i> are not semantic, since they give the text no meaning; they only say how the text should be presented, that is, bold or italic. This is why the semantic tags <strong> and <em> are preferred over <b> and <i>.

Another simple example: when you want to change the font size of some text in your blog post, you do not use a heading tag such as <h1> or <h6>. Instead, you use CSS properties such as font-weight and font-size. This is the correct usage of semantic tagging.
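To see why this matters to a machine, here is a minimal sketch in Python (chosen only for illustration; it uses nothing beyond the standard library's html.parser module). The two HTML snippets are made-up examples that would look much the same in a browser, and the names MeaningCollector and collect_meaning are hypothetical helpers, not part of any library. The sketch simply collects the text it finds inside headings, <strong> and <em>.

```python
from html.parser import HTMLParser

# Made-up snippets: similar on screen, different in meaning.
PRESENTATIONAL = "<p><b>Big title</b></p><p>Some <i>important</i> text.</p>"
SEMANTIC = "<h1>Big title</h1><p>Some <em>important</em> text.</p>"


class MeaningCollector(HTMLParser):
    """Collects text that sits inside tags which carry meaning."""

    MEANINGFUL = {"h1", "h2", "h3", "h4", "h5", "h6", "strong", "em"}

    def __init__(self):
        super().__init__()
        self.stack = []  # meaningful tags that are currently open
        self.found = []  # (tag, text) pairs in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.MEANINGFUL:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if self.stack and text:
            self.found.append((self.stack[-1], text))


def collect_meaning(html):
    """Return the (tag, text) pairs a program can label in the given HTML."""
    parser = MeaningCollector()
    parser.feed(html)
    parser.close()
    return parser.found


print(collect_meaning(SEMANTIC))        # [('h1', 'Big title'), ('em', 'important')]
print(collect_meaning(PRESENTATIONAL))  # []
```

In the semantic version the program finds a headline and an emphasized word. In the presentational version it finds nothing it can label, even though a human reader would see almost the same page.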

Today’s post is more informational than instructional. Knowing the meaning of semantic tagging, HTML and the semantic web is quite important for bloggers and content producers. I do not have a single post on this topic and I am glad that Jamaica decided to write a guest post on this. Let me know what you think in the comments below.   

Once upon a time, computers were non-existent. Everything was manual labor. No word processors, just typewriters. No cloud-based storage, just files and folders, and these files weren’t comfortably USB-sized. They were stacks upon stacks of paper: documents, pictures, graphs, accounting sheets and all that boring information. One can only imagine how hard it was to find a document from 1981 in an entire room of files from 1978 to 1992. This tedious task of finding, identifying and combining information is what semantic tagging wants to help us with.

Fast forward to today. Thank heavens we do have computers, and contemporary technology can help us find information in milliseconds, standardize formats with HTML5 conversion and deliver cloud-based solutions. However, according to Tim Berners-Lee, who coined the term “semantic web”, we can still improve how we find information. To understand semantic tagging, let’s first take a look at what it is striving for: a semantic web.

Tim Berners-Lee had a vision for our web, which he expressed as follows:

I have a dream for the Web in which computers become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A “Semantic Web”, which makes this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The “intelligent agents” people have touted for ages will finally materialize.

To put this in context, remember that the World Wide Web is basically written in Hypertext Markup Language (HTML). Plain HTML does not really delve into the “semantics” of the content, only its visual rendering. For instance, on an e-commerce website, plain HTML cannot assert that “product number J768 is a Kindle with a retail price of $73”. It only says that the picture of the Kindle should appear beside “$73”, so that the viewer will know the item costs that much. In this way, the web has mostly been written to be understood only by humans. In this example, HTML does not categorize the Kindle as a product or $73 as a price. We humans understand it because we see a price next to the item and conclude that it is the item’s price. But web browsers and other software reading the page don’t see it that way: they see only some text reading “Kindle” and some text reading “$73”.
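One way to narrow this gap, which this post does not cover and which is shown here only as an illustration, is to annotate the HTML with microdata attributes (itemscope, itemtype, itemprop) using the schema.org vocabulary. The sketch below is written under those assumptions: the product snippet is invented from the Kindle example above, the currency (USD) is assumed, and ItempropCollector and read_properties are hypothetical helper names. It is a deliberately simplified reader that flattens nested items and ignores values carried in attributes such as href or src.

```python
from html.parser import HTMLParser

# Invented markup for the Kindle example, annotated with schema.org microdata.
# The currency (USD) is an assumption; the post only says "$73".
PRODUCT_HTML = """
<div itemscope itemtype="https://schema.org/Product">
  <span itemprop="sku">J768</span>
  <span itemprop="name">Kindle</span>
  <span itemprop="offers" itemscope itemtype="https://schema.org/Offer">
    <meta itemprop="priceCurrency" content="USD">$<span itemprop="price">73</span>
  </span>
</div>
"""


class ItempropCollector(HTMLParser):
    """Simplified microdata reader: flattens every itemprop into one dict."""

    def __init__(self):
        super().__init__()
        self.props = {}
        self.current = None  # name of the property whose text we are reading

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("itemprop")
        if name is None or "itemscope" in attrs:
            return  # not a property, or a nested item rather than a plain value
        if "content" in attrs:
            self.props[name] = attrs["content"]  # value given in the attribute
        else:
            self.current = name  # value is the element's text

    def handle_data(self, data):
        text = data.strip()
        if self.current and text:
            self.props[self.current] = text
            self.current = None

    def handle_endtag(self, tag):
        self.current = None


def read_properties(html):
    """Return the labelled values a program can read from annotated HTML."""
    parser = ItempropCollector()
    parser.feed(html)
    parser.close()
    return parser.props


print(read_properties(PRODUCT_HTML))
# {'sku': 'J768', 'name': 'Kindle', 'priceCurrency': 'USD', 'price': '73'}
```

With the annotations in place, a program no longer has to guess: “Kindle” is labelled as a name and “73” as a price, which is exactly the kind of statement plain HTML cannot make.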

[Image: HTML5 semantic tags, via w3schools]

If the web were written with semantic tags, it would be easier to Google something. If Google understood what I was looking for, instead of just finding the closest keyword match to the phrase I typed, the results would be more relevant. For instance, if I asked, “how much is a Fujifilm X100s in Paris?”, instead of showing me articles where “Fujifilm”, “X100s” and “Paris” are mentioned, the results could show me e-commerce sites that sell the item and tell me the price in euros, because that is what I’m looking for.

This is just one of the benefits of semantic tagging. Perhaps the concept will be clearer if we apply it to another context, an e-library perhaps. Let’s say this library has a collection of 800,000 items of different kinds of media: books, magazines, journals, newspapers and so on. Semantic tagging will help both visitors and archive heads find items more quickly, and results will show more relevant data. For example, if the collection has been tagged by author, publisher, year, category, subject, nationality and awards, users can refine their search by any of these.
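The library idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three catalogue entries are placeholder records made up for the example, and Item, CATALOGUE and refine are hypothetical names. Each item carries a small dictionary of tags, and a search is refined by keeping only the items that match every tag the user asks for.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    title: str
    tags: dict = field(default_factory=dict)


# Placeholder records, invented for illustration only.
CATALOGUE = [
    Item("Sample Book A", {"category": "book", "author": "Author One", "year": 1981}),
    Item("Sample Book B", {"category": "book", "author": "Author Two", "year": 1992}),
    Item("Sample Journal C", {"category": "journal", "author": "Author One", "year": 1981}),
]


def refine(items, **facets):
    """Keep only the items whose tags match every requested facet."""
    return [
        item
        for item in items
        if all(item.tags.get(name) == value for name, value in facets.items())
    ]


# Everything from 1981, then only the books from 1981.
print([item.title for item in refine(CATALOGUE, year=1981)])
# ['Sample Book A', 'Sample Journal C']
print([item.title for item in refine(CATALOGUE, year=1981, category="book")])
# ['Sample Book A']
```

Without the tags, the only option would be to search the titles for keywords. With them, each extra tag narrows the results a little further.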

The idea of a semantic web using semantic tagging is revolutionary. If we can get machines to understand humans, then we can do more with the next technology. However, the challenges to a semantic web are enormous, literally and figuratively. There are 5 recognized challenges to this vision of the Web:

  1. Vastness – this is the World Wide Web we are talking about, which means a whole universe composed of billions of pages. It will take colossal effort to get machines to understand that much information.
  2. Vagueness – human language is full of variation. For example, “how much” can be used in many different ways, such as asking about a price, an amount or even an extent. It is hard to teach machines to understand vague user queries without proper context.
  3. Uncertainty – these are precise concepts with uncertain values. Probabilistic reasoning techniques are generally used to deal with uncertainty.
  4. Inconsistency – these are logical contradictions. An example would be asking “which came first, the chicken or the egg?”: answers vary and contradict one another, and there is no single correct one. Defeasible reasoning and paraconsistent reasoning are two techniques that can be used to deal with inconsistency.
  5. Deceit – this is when the producer of information intentionally misleads the consumer of that information.

One can see that the challenges of improving the web through semantic tagging are colossal. There is a gap between human intelligence and machines that may never be closed. But who knows whether the dream of a semantic web will come within reach in the near future?

[Image: unsemantic vs semantic, via slideshare]

Salman: We humans always try to find better ways to communicate, both visually and verbally. We care about the design and layout of our content. Why? Because we want to give our audience a better experience when they are on our website. But how about the code behind the website? What does it mean? With semantic HTML and tagging, you can make the code more meaningful. Semantic tagging is important because it’s clean, more accessible and search engine friendly.

I am including a list of working semantic tags as well as some resources. Hope you find them useful :) Although most HTML4 and HTML5 tags have semantic meaning, there are some exclusive tags that are primarily semantic in nature.

[Image: Semantic Tagging and HTML chart]

[Image: HTML5 Semantic Tagging List]
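As a rough illustration of tags that are primarily semantic, the Python sketch below counts how many HTML5 sectioning-style elements a page uses compared with generic <div> and <span> containers. The element set is a small, non-exhaustive selection from HTML5, the two sample pages are made up, and TagCounter and semantic_report are hypothetical helper names. A raw count says nothing about whether the tags are used well, so treat it as a quick look rather than an audit.

```python
from collections import Counter
from html.parser import HTMLParser

# A small, non-exhaustive selection of HTML5 elements that describe their content.
SEMANTIC_ELEMENTS = {
    "header", "nav", "main", "article", "section",
    "aside", "footer", "figure", "figcaption", "time",
}
GENERIC_ELEMENTS = {"div", "span"}

# Made-up pages with the same layout, marked up two ways.
DIV_PAGE = (
    '<div id="header"><div id="nav"></div></div>'
    '<div id="content"><span>Post</span></div>'
    '<div id="footer"></div>'
)
HTML5_PAGE = (
    "<header><nav></nav></header>"
    "<main><article><p>Post</p></article></main>"
    "<footer></footer>"
)


class TagCounter(HTMLParser):
    """Counts every opening tag it sees."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1


def semantic_report(html):
    """Return how many semantic and generic elements the HTML contains."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    semantic = sum(n for tag, n in parser.counts.items() if tag in SEMANTIC_ELEMENTS)
    generic = sum(n for tag, n in parser.counts.items() if tag in GENERIC_ELEMENTS)
    return {"semantic": semantic, "generic": generic}


print(semantic_report(DIV_PAGE))    # {'semantic': 0, 'generic': 5}
print(semantic_report(HTML5_PAGE))  # {'semantic': 5, 'generic': 0}
```

Both pages would render the same layout, but only the second one tells a program which part is the header, the navigation, the main article and the footer.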

Check out the following short PDF files, which will give you more details about semantic tagging and HTML.

Are you a semantic tagger? Let me know your thoughts on semantic HTML.

This is a guest post written and contributed by Jamaica Sanchez, a Website Auditor with solid experience. She has been an advocate of cloud computing for improved work efficiency and performance. She also has a passion for dancing, cooking and playing golf. Follow her on Twitter or Google+. The author’s views above are entirely his or her own and may not reflect the views of Mastermind Blogger.