What a tangled web we weave, indeed. About 40 percent of the world's population uses the Web for news, entertainment, communication and myriad other purposes [source: Internet World Stats]. Yet even as more and more people log on, they are actually finding less of the data that's stored online. That's because only a sliver of what we know as the World Wide Web is easily accessible.
The so-called surface Web, which all of us use routinely, consists of data that search engines can find and then offer up in response to your queries. But in the same way that only the tip of an iceberg is visible to observers, a traditional search engine sees only a small amount of the information that's available -- a measly 0.03 percent.
As for the rest of it? Well, a lot of it's buried in what's called the deep Web. The deep Web (also known as the undernet, invisible Web and hidden Web, among other monikers) consists of data that you won't locate with a simple Google search.
No one really knows how big the deep Web is, but it's hundreds (or perhaps even thousands) of times bigger than the surface Web. This data isn't necessarily hidden on purpose. It's just hard for current search engine technology to find and make sense of it.
There's a flip side of the deep Web that's a lot murkier -- and, sometimes, darker -- which is why it's also known as the dark Web. In the dark Web, users really do intentionally bury data. Often, these parts of the Web are accessible only if you use special browser software that helps to peel away the onion-like layers of the dark Web.
Deep Web Search Engines
From the United States Government
Searches scientific content
Searches for images
Deep Web Search Tools
If you do nothing else with the deep Web, learn how to use the three websites described below.
CompletePlanet™ uses a query-based engine to index 70,000+ deep Web databases and surface Web sites. Appendix A lists 60 of the largest deep Web databases, which contain 10% of the information in the deep Web, or 40 times the content of the entire surface Web. These 60 databases are included in CompletePlanet's indexes. CompletePlanet is sponsored by BrightPlanet® Corporation, a leader in deep Web searches. The interface is intuitive and easy to use. You can do a keyword search across all 70,000+ databases to find which databases to use for your search. You can also browse by category, and then search the databases of interest.
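The two figures quoted for the 60 largest databases also imply a rough size for the deep Web as a whole. The short calculation below is only a sketch of that arithmetic; the function name and parameter names are illustrative assumptions, and the inputs are simply the 10% and 40x figures quoted above.

```python
def implied_total_multiple(sample_multiple: float, sample_share_percent: float) -> float:
    """Scale a sample up to the whole.

    If a sample makes up `sample_share_percent` of a collection and is
    `sample_multiple` times the size of some reference, return the size of
    the whole collection as a multiple of that same reference.
    """
    return sample_multiple * 100 / sample_share_percent


# Figures quoted in the text: the 60 largest databases hold 10% of the
# deep Web and 40 times the content of the surface Web.
deep_web_multiple = implied_total_multiple(sample_multiple=40, sample_share_percent=10)
print(f"Implied deep Web size: about {deep_web_multiple:.0f} times the surface Web")
```

Taken at face value, these numbers put the deep Web at roughly 400 times the surface Web, which is in line with the "hundreds of times bigger" estimate given earlier.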
ProFusion is a combination of a query-based engine and a deep Web directory portal. The directory structure is accessed by clicking on Specialized Searches. With an account, you can set up custom My Search Groups to search customized lists of websites and/or databases of your choice. For example, you could create a group called Technology and add all the databases and websites of interest to you. This group is saved to your profile. You could then, at any future time, search this group on a research topic with keywords. This is a great time saver. Their query-based engine is called SmartDiscovery®.
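The idea of a saved search group can be sketched in a few lines of Python. This is not how ProFusion is implemented; it is only a toy model, and every name in it (`make_source`, `save_group`, `search_group`, the sample databases) is a hypothetical stand-in. Each source is modeled as a function that takes a keyword and returns matching titles, and a group is a named list of such sources stored in a profile.

```python
from typing import Callable, Dict, List

# A "source" is any callable that takes a keyword query and returns result titles.
Source = Callable[[str], List[str]]


def make_source(name: str, documents: List[str]) -> Source:
    """Build a toy source that does case-insensitive substring matching."""
    def search(query: str) -> List[str]:
        needle = query.lower()
        return [f"{name}: {doc}" for doc in documents if needle in doc.lower()]
    return search


# The profile maps a group name to the sources saved under it.
profile: Dict[str, List[Source]] = {}


def save_group(group_name: str, sources: List[Source]) -> None:
    """Store a named list of sources so it can be reused later."""
    profile[group_name] = list(sources)


def search_group(group_name: str, query: str) -> List[str]:
    """Run one keyword query against every source in a saved group."""
    results: List[str] = []
    for source in profile[group_name]:
        results.extend(source(query))
    return results


# Hypothetical sample data standing in for real databases and websites.
save_group("Technology", [
    make_source("db_a", ["Fiber optic networking", "Gardening tips"]),
    make_source("db_b", ["Wireless Networking basics"]),
])
print(search_group("Technology", "networking"))
```

The time saving comes from defining the group once: every later query reuses the same list of sources without reselecting them.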
SurfWax also uses a site's existing search capability as part of the meta-search process to tap the deep Web. They use proprietary algorithms to interpret the site's search criteria (Boolean, etc.). With an account, you can also set up custom SearchSets to search customized lists of websites and/or databases of your choice. SurfWax also has a news accumulator feature with over 50,000 news topics in 84 categories. This news accumulator feature is a godsend, providing high-quality results. These are some useful news accumulator categories: all topics, networking, technology, telecommunication, and web services. In addition, this site has WikiWax, which takes the online encyclopedia Wikipedia to the next level. WikiWax does advanced look-aheads on Wikipedia searches to speed your keyword choices.
Metasearch engines search several search engines simultaneously and combine the results. In theory, it might seem that you get broader coverage this way. In practice, you lose precision because some metasearch engines cannot pass Boolean operators, and most of the original engine's syntax does not work. These are popular metasearch engines: