A Basic Understanding of Web Research
Knowledge is of two kinds. We know a subject ourselves, or we know where we can find information upon it. — Samuel Johnson
The importance of this statement cannot be overemphasized, all the more so in the age of the Internet, which has revolutionized the way we approach information sources by making instant access to data possible. Be it a CEO looking up information for a business presentation or a research scholar searching for material on a potential project, the Internet seems to offer solutions to all. We need only know where the information exists. Internet research involves making the best use of search engines and evaluating their output. In this context, the ability to conduct effective web research and reach the right information comes in handy.
Searching the Net requires a patient, positive attitude and the willingness to keep trying until the desired results are obtained. Becoming a proficient researcher takes insight, experience and perseverance, along with some luck and a lot of practice. The goal of research is to find pertinent information quickly, and a well-designed search strategy helps avoid the pitfalls of retrieving too much or too little data.
Understanding search engines
The most popular tool for finding information online is the web search engine, which looks for information on the public World Wide Web. Thanks to search engines such as Google, Yahoo Search, AltaVista, Excite, MSN, Ask Jeeves, HotBot, AllTheWeb, WiseNut and Teoma, the task of finding information on almost anything has become relatively easy and fast. Other kinds of search engines include enterprise search engines that look within intranets, personal search engines, and mobile search engines. Then there are metasearch engines, which query other search engines and combine the results received from all of them. In effect, the user is not using just one search engine but many search engines at once. Metasearch engines usually differ from general search engines in that they do not have a database of their own but draw data from several search engines. Examples of metasearch engines include Mamma, Metacrawler, Dogpile and Excite.
There are also search engines for specialized online databases that cater to a particular subject area, for instance LexisNexis, which provides largely legal information, Factiva for business information, and Medline for the medical and health fields. Depending on one’s requirement, the relevant search engine or database can be used.
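The way a metasearch engine combines results can be illustrated with a small sketch. The engine names, the URLs and the reciprocal-rank scoring used here are all assumptions made for illustration; actual metasearch engines may merge and rank results quite differently.

```python
from collections import defaultdict

def merge_results(results_by_engine):
    """Combine ranked URL lists from several engines into one ranked list."""
    scores = defaultdict(float)
    for urls in results_by_engine.values():
        for rank, url in enumerate(urls, start=1):
            # Assumed scoring rule: a higher position in any list earns more credit.
            scores[url] += 1.0 / rank
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores, key=lambda url: (-scores[url], url))

# Hypothetical result lists returned by two underlying engines.
sample = {
    "engine_a": ["example.org/a", "example.org/b", "example.org/c"],
    "engine_b": ["example.org/b", "example.org/a", "example.org/d"],
}
merged = merge_results(sample)
print(merged)
```

Pages returned by both engines rise to the top of the merged list, which is the practical benefit of querying several engines at once.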
Features of Search Engines
Though the logic adopted by each search engine varies, search engines do have certain features in common. More features are usually found under the “Advanced search” option of a search engine.
Use of Boolean Operators
Search engines generally support the Boolean operators AND, OR and NOT. Using the operator OR between search terms expands the search results, so it is resorted to when the search output is too small. The operator AND narrows the search results, so it is used when the search output is too large. The operator NOT ensures that the results do not contain the search term following it. For example, a search for cats OR dogs returns pages that mention either term, while cats AND dogs returns only pages that mention both.
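The effect of the three operators can be sketched with Python sets. The tiny document collection and the matching helper are hypothetical; a real search engine works on a far larger index and usually applies its own ranking on top.

```python
# Hypothetical mini-collection: document id -> text.
documents = {
    1: "web research with search engines",
    2: "library research methods",
    3: "search engines and metasearch engines",
    4: "cooking recipes",
}

def matching(term):
    """Return the ids of documents that contain the term as a whole word."""
    return {doc_id for doc_id, text in documents.items() if term in text.split()}

research_or_search = matching("research") | matching("search")        # OR expands
research_and_search = matching("research") & matching("search")       # AND narrows
search_not_metasearch = matching("search") - matching("metasearch")   # NOT excludes

print(sorted(research_or_search))      # [1, 2, 3]
print(sorted(research_and_search))     # [1]
print(sorted(search_not_metasearch))   # [1]
```

OR corresponds to a set union, AND to an intersection and NOT to a difference, which is why OR can only add results while AND and NOT can only remove them.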
Use of Thesaurus
Search engines of specialized databases often support a thesaurus, in which terms are arranged hierarchically, or a controlled vocabulary, which requires the use of predefined, authorized terms. This feature is usually seen in databases that cater to a particular subject area. Using a thesaurus is believed to reduce the number of irrelevant items in the search results. In recent years, free-text search, which relies on natural language indexing and places no restriction on the vocabulary, has become more popular than the thesaurus or controlled vocabulary.
Wild card Symbols and Truncation
The wild card symbol, a feature of many databases, takes care of spelling variations within a keyword, particularly differences between British and American spelling. Truncation, similarly, takes care of the various forms of a keyword that share a common stem. Since different search engines use different symbols for wild cards and truncation, it is a good idea to go through the user manual or help pages provided by each search engine to understand how it functions. Other features commonly available in most databases include phrase search, the option to re-run saved searches, and “alert” services.
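A minimal sketch of the two ideas follows, using the shell-style patterns of Python's fnmatch module, where ? stands for exactly one character and * for any run of characters. These symbols are an assumption made for the example; as noted above, each search engine defines its own, and the word list is made up.

```python
from fnmatch import fnmatchcase

# Hypothetical index terms.
words = ["organize", "organise", "organization", "organ", "origin"]

def search(pattern):
    """Return the words that match a shell-style pattern, in list order."""
    return [word for word in words if fnmatchcase(word, pattern)]

# Wild card: one character may vary (American vs. British spelling).
wildcard_hits = search("organi?e")

# Truncation: anything may follow the stem.
truncation_hits = search("organi*")

print(wildcard_hits)     # ['organize', 'organise']
print(truncation_hits)   # ['organize', 'organise', 'organization']
```

The wild card picks up both spellings of a single word, while truncation also brings in longer forms built on the same stem.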
Evaluating the search output
Though search engines have made the job of finding the “right” information relatively easy, it helps to judge for ourselves whether we have actually acquired the “right” information. The key lies in one’s ability to separate the wheat from the chaff, as they say. A basic understanding of how websites and the Internet work helps in assessing the authenticity of a website. It should be remembered that not all websites offer authentic information. Suppose you are looking on the Internet for information on the Nobel laureate Prof. Amartya Sen through Google. You will notice that the search output gives you a number of links. Do all these links give authentic information? The answer is no. Of the links, the professor’s profile on his university website, http://post.economics.harvard.edu/faculty/sen/sen.html, is considered authentic. It is not difficult to see why. The university is responsible for any content that it hosts on its website, and hence validates all of that content. The profile of the professor on the official Nobel Foundation website is likewise considered authentic. However, the same cannot be said of other websites, even though they may give considerable information on the professor. The lack of validation and the presence of outdated information are some of the reasons why we cannot rely on information from all websites. It also helps to go through links such as “About Us” or “Who we are” on a website to understand who is hosting the information and so gauge its authenticity.
Primary and Secondary Sources
Likewise, when one looks up information on the Net about the university or college from which one graduated, chances are that the search output throws up a number of seemingly relevant links. The first one will usually be the official university website. Such official websites are referred to as primary sources. There are other websites, such as alumni websites, which are not considered authentic because the information they host is either not validated or not current. There are also what are known as aggregators, which compile information on the university and make it available on their own websites. Any website that contains information on an organization, other than the organization's official website (the primary website), is referred to as a secondary website. The information on primary websites is considered authentic. It helps to understand the difference between primary and secondary sources in web research.
Secondary websites are not reliable, even when they appear to carry the latest information, because that information is not authenticated. As pointed out earlier, primary sources validate information before posting it on their websites, and the same cannot be said of secondary sources. Another common problem is that secondary websites may not be updated and may continue to host old information. Checking the date when a website was last updated usually helps. However, in web research, secondary sources can be used as leads to arrive at primary websites. For example, suppose you are looking up information on a book. What Amazon.com tells you about the book is secondary information, and hence Amazon.com is a secondary source, while what the publisher of the book says about it is considered primary information.
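The distinction can be turned into a simple first-pass check: a link counts as primary only if its host is the organization's official domain or a subdomain of it. The domain names below are hypothetical, and the official domain has to be known in advance; the sketch sorts links, it does not judge the quality of their content.

```python
from urllib.parse import urlparse

def classify(url, official_domain):
    """Label a URL 'primary' if it is hosted on the official domain, else 'secondary'."""
    host = (urlparse(url).hostname or "").lower()
    official = official_domain.lower()
    # Match the domain itself or any subdomain, but not look-alike names.
    is_primary = host == official or host.endswith("." + official)
    return "primary" if is_primary else "secondary"

# Hypothetical search results for a university whose official domain is assumed known.
links = [
    "https://www.university.example/about",
    "https://alumni-network.example/university",
    "https://notuniversity.example/profile",
]
labels = [classify(link, "university.example") for link in links]
print(labels)   # ['primary', 'secondary', 'secondary']
```

Links labelled secondary can still serve as leads toward the primary website, as described above.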
Various Types of Websites
It also helps to know the extension of a website's domain name, such as .gov, .com, .net, .org, .edu, .name and so on. In general, the following domain extensions are seen in website URLs:
.gov: government
.com/.net: company/individual
.org: organization
.edu: education (university)
.name: individual
There are also country-specific extensions such as .co.uk, .co.in and .co.nz, for example www.bbc.co.uk. The last portion of the extension identifies the country, using the standard two-letter country codes.
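Reading the extension off a URL can be sketched as follows. The small tables of country codes and second-level labels are assumptions made for the example and are far from complete, so the function is a rough heuristic rather than a full domain parser.

```python
from urllib.parse import urlparse

# Illustrative, incomplete tables (assumptions for this sketch).
COUNTRY_CODES = {"uk": "United Kingdom", "in": "India", "nz": "New Zealand"}
SECOND_LEVEL = {"co", "ac", "gov", "org"}

def split_suffix(url):
    """Return (extension, country) for a URL; country is None if no code is recognized."""
    if "://" not in url:
        url = "http://" + url   # urlparse needs a scheme to find the host
    host = (urlparse(url).hostname or "").lower()
    labels = host.split(".")
    last = labels[-1]
    country = COUNTRY_CODES.get(last)
    # Extensions such as co.uk consist of a category label plus a country code.
    if country and len(labels) >= 3 and labels[-2] in SECOND_LEVEL:
        return ".".join(labels[-2:]), country
    return last, country

bbc = split_suffix("www.bbc.co.uk")
harvard = split_suffix("http://post.economics.harvard.edu/faculty/sen/sen.html")
print(bbc)       # ('co.uk', 'United Kingdom')
print(harvard)   # ('edu', None)
```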
Using Online Sources
Certain websites, such as www.whois.sc, give information on the registrant of a given website. Others, such as www.alexa.com, help one understand the traffic to a website. Apart from sound Web surfing skills, a basic understanding of how websites function helps one in gathering authentic information. There are also several online sources, such as www.answers.com, that serve as quick reference tools.
Access to documents
Having identified the information of interest, the next step is to gain access to the documents. Access to the necessary documents is not always free. The content on some websites is free, while other websites charge for the content they host. Sometimes the content offered is partly free. For example, the publishers of scholarly journals offer free abstracts on their websites, while access to a full-text article requires a subscription. Publishers offer several methods of access. Some require a full-year subscription, which needs to be renewed every year, while others also charge for individual documents.
Copyright issues
Online content is, more often than not, governed by copyright regulations. So, while using the content, the necessary steps need to be taken. This usually requires acknowledging the sources of information.