Internet Research Techniques
Internet Research Challenges
Finding information on the Internet that is relevant, useful, current and credible can be challenging.
Information on the Internet is:
- decentralized - thousands of networks are involved
- disorganized - no central index or database exists
- dynamic - changing every minute of every day
- expanding rapidly
- not subject to traditional pre-publication checks and balances
- not always authentic or accurate
- not always predictable - resources can disappear or change suddenly
Searching tools and techniques are:
- numerous and varied
- not standardized
- constantly changing
Cataloging Internet Resources
Thousands of organizations are engaged in cataloging Internet resources. Addresses and descriptions of sites are collected in databases, which users can then search. These databases are built in one of two ways:
- Human collection, review and classification of Internet sites. This method usually results in smaller but higher quality databases because human judgement is applied to each entry.
- Automated collection and classification of Internet sites using programs called spiders or crawlers. This method produces larger databases, but they often lack the quality found in human-generated databases.
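The automated approach can be illustrated with a minimal sketch. It assumes a tiny in-memory "web" (the PAGES dictionary, the addresses and the crawl function are all hypothetical); a real spider would fetch pages over the network and apply far more elaborate classification.

```python
from collections import deque

# Hypothetical in-memory "web": address -> (page description, outgoing links).
# A real spider would download each page instead of reading a dictionary.
PAGES = {
    "a.example": ("Calgary travel guide", ["b.example", "c.example"]),
    "b.example": ("Topaz gem prices", ["a.example", "missing.example"]),
    "c.example": ("Robin Hood legend", []),
}

def crawl(start):
    """Visit every page reachable from start and record its description."""
    database = {}
    queue = deque([start])
    while queue:
        address = queue.popleft()
        if address in database or address not in PAGES:
            continue  # already collected, or a dead link
        description, links = PAGES[address]
        database[address] = description
        queue.extend(links)  # follow the links found on this page
    return database

catalog = crawl("a.example")
```

The resulting catalog maps each discovered address to its description, which is the kind of database a search tool would then query.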
Searching by Browsing
In this method of searching, the search page presents several topics and sub-topics. Users search by selecting a topic, then a sub-topic, and continue "drilling down" until the required information is found. The browsing method is often used for broad searches.
Try this method of searching at Yahoo. Select the topic Regional, then Countries, then Canada and continue making selections until you reach the category for Calgary.
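Drilling down through a directory can be pictured as walking a nested structure one selection at a time. The sketch below is only an illustration: the DIRECTORY contents, the intermediate "Cities" level and the site address are assumptions, not the actual layout of any real directory.

```python
# Hypothetical slice of a topic directory; the leaf value is a list of sites.
DIRECTORY = {
    "Regional": {
        "Countries": {
            "Canada": {
                "Cities": {"Calgary": ["calgary.example"]},
            },
        },
    },
}

def drill_down(directory, path):
    """Follow a sequence of topic selections and return what is found there."""
    node = directory
    for topic in path:
        node = node[topic]  # raises KeyError if the topic is not listed
    return node

# Each selection narrows the view until the category of interest is reached.
sites = drill_down(DIRECTORY, ["Regional", "Countries", "Canada", "Cities", "Calgary"])
subtopics = sorted(drill_down(DIRECTORY, ["Regional", "Countries"]))
```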
Searching by Keyword
In this method of searching:
- The user enters keywords in a query box and requests a search. Some search tools place more emphasis on the first keyword, assuming it is the most important.
- The search tool attempts to match the keywords with entries in its database, then returns a "hit list" of sites related to the keywords. The sites in the hit list are usually ranked by relevance, with the best matches at the top of the list. The information for each site includes a link to the particular Internet resource and, in many cases, a brief abstract of the site.
- The user selects appropriate sites from the hit list and reviews the pages to find the information required. The keyword searching method is often used for narrow, specific searches.
Example:
Hit List
Site 1 - link and description
Site 2 - link and description
Site 3 - link and description
etc.
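The matching and ranking steps can be sketched as follows. This is a simplified illustration, not how any particular search tool works: the DATABASE entries are hypothetical, relevance is just a count of keyword occurrences, and the double weight on the first keyword is an assumed way of modelling the "emphasis" mentioned above.

```python
# Hypothetical database: address -> brief abstract of the site.
DATABASE = {
    "site1.example": "history of the city of oshawa and its mayor",
    "site2.example": "oshawa weather forecast",
    "site3.example": "gardening tips",
}

def keyword_search(query, database):
    """Return (address, abstract) pairs ranked by a simple relevance score."""
    keywords = query.lower().split()
    hits = []
    for address, abstract in database.items():
        words = abstract.lower().split()
        score = 0
        for position, keyword in enumerate(keywords):
            weight = 2 if position == 0 else 1  # assumed emphasis on first keyword
            score += weight * words.count(keyword)
        if score > 0:
            hits.append((score, address, abstract))
    hits.sort(key=lambda hit: (-hit[0], hit[1]))  # best matches first
    return [(address, abstract) for _, address, abstract in hits]

hit_list = keyword_search("mayor oshawa", DATABASE)
```

Sites that match no keyword are left out of the hit list entirely; the rest appear in descending order of score.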
Improving Keyword Search Results
Search results can usually be improved by using search operators. These operators help the search tool select better matches from its database. Some search tools recognize many different search operators; consult the search tool's HELP page for details. Three search operators are so widely used that they are practically universal:
- +word - hit list includes sites that contain this word
- -word - hit list excludes sites that contain this word
- "phrase" - hit list includes sites that contain this exact phrase (multiple words are treated as a single word)
Examples:
- +topaz -gem - hit list includes sites containing the word "topaz" and excludes sites containing the word "gem"
- "Robin Hood" - hit list includes sites containing the exact phrase "Robin Hood"
- "Robin Hood" -flour - hit list includes sites containing the exact phrase "Robin Hood" and excludes sites containing the word "flour" (to exclude sites about the Robin Hood Flour Company)
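A rough sketch of how a tool might interpret these three operators is shown below. The function names are hypothetical, matching is case-insensitive on whole words, and terms without a + or - sign are treated as required; real search tools may make different choices.

```python
import re

def parse_query(query):
    """Split a query into (required, excluded) lists of words or phrases."""
    required, excluded = [], []
    # Each term is an optional sign followed by a quoted phrase or a bare word.
    for sign, phrase, word in re.findall(r'([+-]?)(?:"([^"]+)"|(\S+))', query):
        term = phrase or word
        (excluded if sign == "-" else required).append(term)
    return required, excluded

def contains(text, term):
    """True if term appears in text as whole words, ignoring case."""
    return re.search(r"\b" + re.escape(term) + r"\b", text, re.IGNORECASE) is not None

def matches(text, query):
    """True if text has every required term and none of the excluded ones."""
    required, excluded = parse_query(query)
    return (all(contains(text, term) for term in required)
            and not any(contains(text, term) for term in excluded))
```

For instance, matches("Robin Hood Flour Company", '"Robin Hood" -flour') is False because the excluded word is present, while a page about the legend alone would pass.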
Try using this method of searching to answer the following questions with Google:
- Who is the Mayor of Oshawa?
- What is the third question in the Four-Way Test?
- What was the real name of the legendary figure known as "Grey Owl"?
Boolean Search Operators
Boolean operators can be used to define the relationship between multiple keywords. Not all search engines recognize these operators. Check the HELP section of the search engine to determine which operators can be used as part of the query term.
AND Operator
- Include resources that contain all keywords
- Used to narrow or tighten a search
- Example:
OR Operator
- Include resources that contain either or both keywords
- Used to broaden a search
- Example:
NOT Operator
- Exclude resources that contain the keyword
- Used to narrow a search
- Example:
Parenthetical phrases can be used to create more complex query terms consisting of multiple boolean operators.
Example:
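One way to picture the Boolean operators is as set operations on the sites that contain each keyword. The sketch below assumes a small hypothetical index (INDEX and the site names are made up) and treats NOT as "and not"; the exact query syntax varies between search engines.

```python
# Hypothetical index: keyword -> set of sites whose text contains it.
INDEX = {
    "robin": {"site1", "site2", "site3"},
    "hood": {"site1", "site2"},
    "flour": {"site2"},
    "sherwood": {"site4"},
}

def sites_with(keyword):
    """Sites containing the keyword (empty set if the keyword is unknown)."""
    return INDEX.get(keyword, set())

both = sites_with("robin") & sites_with("hood")        # robin AND hood
either = sites_with("hood") | sites_with("sherwood")   # hood OR sherwood
without = sites_with("robin") - sites_with("flour")    # robin NOT flour

# (hood OR sherwood) NOT flour: the parentheses group the OR before the NOT.
combined = (sites_with("hood") | sites_with("sherwood")) - sites_with("flour")
```

AND can only shrink a result set and OR can only grow it, which is why the first narrows a search and the second broadens it.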
Proximity Search Operators
Proximity operators can be used to specify the relative location of keywords. Not all search engines recognize these operators. Check the HELP section of the search engine to determine which operators can be used as part of the query term.
ADJ Operator
- Keywords must occur beside each other, but may be in either order
- Example:
BEFORE Operator
- Keywords must both occur, with the first keyword appearing before the second
- Example:
NEAR Operator
- Find two keywords that are within a specified number of words of each other, in either direction
- Example:
FAR Operator
- Find two keywords that are at least 25 words apart, in either direction
- Example:
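The four proximity operators can be sketched in terms of word positions. This is an illustration under stated assumptions: the function names are hypothetical, text is split on whitespace, a single qualifying pair of occurrences is enough for a match, and the default of 10 words for NEAR is arbitrary (the 25-word threshold for FAR comes from the description above).

```python
def positions(words, keyword):
    """Indexes at which keyword occurs in the list of words."""
    return [i for i, word in enumerate(words) if word == keyword]

def distances(text, first, second):
    """Signed gaps (position of second minus position of first) for every pair."""
    words = text.lower().split()
    return [j - i for i in positions(words, first) for j in positions(words, second)]

def adj(text, a, b):
    """Keywords occur beside each other, in either order."""
    return any(abs(d) == 1 for d in distances(text, a, b))

def before(text, a, b):
    """Keyword a occurs somewhere before keyword b."""
    return any(d > 0 for d in distances(text, a, b))

def near(text, a, b, within=10):
    """Keywords occur within the given number of words, in either direction."""
    return any(0 < abs(d) <= within for d in distances(text, a, b))

def far(text, a, b, at_least=25):
    """Keywords occur at least the given number of words apart."""
    return any(abs(d) >= at_least for d in distances(text, a, b))
```

With the text "grey owl was a writer", adj(text, "owl", "grey") holds because order does not matter, whereas before(text, "owl", "grey") does not.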
Truncation Search Operator
Truncation locates resources that include alternate forms of a keyword. Not all search engines recognize this operator. Check the HELP section of the search engine to determine which operators can be used as part of the query term.
- Truncation, also known as stemming, is applied through the use of a wildcard character. The most widely recognized wildcard character is an asterisk (*); some search engines recognize other characters as wildcards.
- Truncation is useful where a keyword has multiple valid forms or spellings
- Example: Canad* will locate Canada, Canadian, Canadienne, etc.
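Truncation can be approximated by translating the wildcard into a regular expression. The sketch below is one possible interpretation: the function names and the sample sentence are hypothetical, the pattern Canad* is used for illustration, and the asterisk is taken to stand for any run of letters or digits.

```python
import re

def wildcard_to_regex(pattern):
    """Turn a keyword containing * wildcards into a case-insensitive regex."""
    # Escape everything, then let each escaped * match any run of word characters.
    escaped = re.escape(pattern).replace(r"\*", r"\w*")
    return re.compile(escaped, re.IGNORECASE)

def truncation_matches(pattern, text):
    """Return the words in text that the truncated keyword would locate."""
    regex = wildcard_to_regex(pattern)
    return [word for word in re.findall(r"\w+", text) if regex.fullmatch(word)]

found = truncation_matches(
    "Canad*",
    "Canada is home to many a Canadian and Canadienne; candy is not.",
)
```

Here found contains Canada, Canadian and Canadienne, while "candy" is skipped because it does not begin with the stem.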
Field Search Operators
Field search operators direct the search engine to look for keywords in different parts of the web page. Not all search engines recognize these operators. Check the HELP section of the search engine to determine which operators can be used as part of the query term.
TITLE Operator
- Locates resources where the keyword occurs in the title of the web page
- Example:
URL Operator
- Locates resources where the keyword occurs in the URL of the web page
- Example:
LINK Operator
- Locates resources where the keyword occurs in hyper-text links on the web page
- Example:
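The idea behind field searching can be sketched with Python's standard html.parser module. The FieldExtractor class, the field_match function and the sample page are hypothetical, and the link field is assumed to mean the target address (href) of each hyperlink; a real search engine may define the fields differently.

```python
from html.parser import HTMLParser

class FieldExtractor(HTMLParser):
    """Collect the title text and link targets of an HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            self.links.extend(value for name, value in attrs if name == "href" and value)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

def field_match(field, keyword, url, html):
    """Check one field ("title", "url" or "link") of a page for the keyword."""
    extractor = FieldExtractor()
    extractor.feed(html)
    fields = {"title": [extractor.title], "url": [url], "link": extractor.links}
    return any(keyword.lower() in value.lower() for value in fields[field])

# Hypothetical sample page and address.
PAGE_URL = "http://calgary.example/"
PAGE = ('<html><head><title>Calgary Stampede</title></head>'
        '<body><a href="http://rodeo.example/">Events</a></body></html>')
```

With this page, field_match("title", "stampede", PAGE_URL, PAGE) succeeds, while the same keyword is not found in the url or link fields.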