# robots.txt for http://brainstormsandraves.com/
# Updated 2005-10-11 14:56pm, pdt
#
# Excluding images from Google info:
# http://www.google.com/remove.html#images
# Yahoo Slurp info:
# http://help.yahoo.com/help/us/ysearch/slurp/slurp-02.html
# boitho info: http://www.boitho.com/dcbot.html
# btbot info: http://www.btbot.com/btbot.html
# Findexa Crawler: http://www.findexa.no/gulesider/article26548.ece
# Swooglebot info: http://swoogle.umbc.edu/swooglebot.html
#
# Alphabetized list is adapted (a lot) from
# WebmasterWorld.com: robots.txt
# GNU Robots.txt. Used with permission.
# http://webmasterworld.com/robots.txt
# I alphabetized list, removed some, added some,
# compressed the list to reduce file size.
# See 'Put your robots.txt on a diet'
# http://www.webmasterworld.com/forum93/3.htm
# 2005-10-01 1327pm, pdt
#
# IMPORTANT: This website does NOT allow non-authorized robots.
#
# Many 'BAD' bots disregard robots.txt file. I've taken other
# measures to keep them out, including some in the list
# below that don't behave nicely. ;-)
# Unfortunately, we can't rely on all bots to mind or
# even look at the robots.txt file. Some of them are up
# to no good or go nutso and cause drain on servers, sadly.
#
###############
Sitemap: http://brainstormsandraves.com/sitemap.xml
#
# 2005-09-23 1610pm, pdt
# Yahoo! Slurp has disregarded the robots.txt Disallow
# instructions, such as:
# remote host: lj2437.inktomisearch.com
# with IP: 68.142.251.47
# So I'm moving the info to the top here to see if it
# works this way, and following their instructions.
# http://help.yahoo.com/help/us/ysearch/slurp/slurp-02.html
#
User-agent: Slurp
Disallow: /0x938n39s
Disallow: /contact/
Disallow: /cgi-bin/
Disallow: /im
Disallow: /img/
Disallow: /images/
Disallow: /images1/
Disallow: /inc/
Disallow: /incattack/
Disallow: /includes/
Disallow: /js/
Disallow: /js1/
Disallow: /js2/
Disallow: /mt/
Disallow: /mtcss/
Disallow: /mtcss2/
Disallow: /secret/

## A
User-agent: Alexibot
User-agent: Aqua_Products
User-agent: asterias
Disallow: /

## B
User-agent: b2w/0.1
User-agent: BackDoorBot/1.0
User-agent: BecomeBot
User-agent: BlowFish/1.0
User-agent: Bookmark search tool
User-agent: BotALot
User-agent: BuiltBotTough
User-agent: Bullseye/1.0
User-agent: BunnySlippers
Disallow: /

## C
User-agent: Charlotte
Disallow: /

User-agent: CheeseBot
User-agent: CherryPicker
User-agent: CherryPickerElite/1.0
User-agent: CherryPickerSE/1.0
User-agent: Copernic
User-agent: CopyRightCheck
User-agent: cosmos
User-agent: Crescent
User-agent: Crescent Internet ToolPak HTTP OLE Control v.1.0
Disallow: /

## D
User-agent: DittoSpyder
User-agent: dumbot
Disallow: /

## E
User-agent: EmailCollector
User-agent: EmailSiphon
User-agent: EmailWolf
User-agent: Enterprise_Search/1.0
User-agent: Enterprise_Search
User-agent: EroCrawler
User-agent: es
User-agent: ExtractorPro
Disallow: /

## F
User-agent: FairAd Client
User-agent: Flaming AttackBot
User-agent: Foobot
User-agent: FreeFind
Disallow: /

## G
User-agent: Gaisbot
User-agent: GetRight/4.2
# User-agent: Googlebot-Image
User-agent: grub-client
User-agent: grub
Disallow: /

## H
User-agent: Harvest/1.5
User-agent: Hatena Antenna
User-agent: hloader
User-agent: http://www.SearchEngineWorld.com bot
User-agent: http://www.WebmasterWorld.com bot
User-agent: humanlinks
User-agent: httplib
Disallow: /

## I
# User-agent: ia_archiver
# User-agent: ia_archiver/1.6
User-agent: InfoNaviRobot
User-agent: Iron33/1.0.2
Disallow: /

## J
User-agent: JennyBot
User-agent: Jetbot/1.0
User-agent: Jetbot
Disallow: /

## K
User-agent: Kenjin Spider
User-agent: Keyword Density/0.9
Disallow: /

## L
User-agent: larbin
User-agent: LexiBot
User-agent: libWeb/clsHTTP
User-agent: LinkextractorPro
User-agent: LinkScan/8.1a Unix
User-agent: LinkWalker
User-agent: looksmart
User-agent: LNSpiderguy
User-agent: lwp-trivial/1.34
User-agent: lwp-trivial
Disallow: /

## M
User-agent: Mata Hari
User-agent: Microsoft URL Control
User-agent: Microsoft URL Control - 5.01.4511
User-agent: Microsoft URL Control - 6.00.8169
User-agent: MIIxpc
User-agent: MIIxpc/4.2
User-agent: Mister PiX
User-agent: moget
User-agent: moget/2.1
User-agent: Mozilla
User-agent: mozilla
User-agent: mozilla/3
User-agent: mozilla/4
User-agent: mozilla/5
User-agent: Mozilla/4.0 (compatible; BullsEye; Windows 95)
User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows NT)
User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows 95)
User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows 98)
User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows XP)
User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows 2000)
User-agent: MSIECrawler
Disallow: /

## N
User-agent: naver
User-agent: NetAnts
User-agent: NetMechanic
User-agent: NICErsPRO
User-agent: Nutch
Disallow: /

## O
User-agent: Offline Explorer
User-agent: OmniExplorer_Bot
User-agent: Openbot
User-agent: Openfind
User-agent: Openfind data gathere
User-agent: Oracle Ultra Search
Disallow: /

## P
User-agent: PerMan
User-agent: ProPowerBot/2.14
User-agent: ProWebWalker
User-agent: psbot
User-agent: Python-urllib
Disallow: /

## Q
User-agent: QueryN Metasearch
Disallow: /

## R
User-agent: Radiation Retriever 1.1
User-agent: RepoMonkey
User-agent: RepoMonkey Bait & Tackle/v1.01
User-agent: RMA
Disallow: /

## S
User-agent: scooter
User-agent: searchpreview
User-agent: SiteSnagger
User-agent: sootle
User-agent: SpankBot
User-agent: spanner
User-agent: Stanford
User-agent: Stanford Comp Sci
User-agent: suzuran
User-agent: Szukacz/1.4
Disallow: /

## T
User-agent: Teleport
User-agent: TeleportPro
User-agent: Telesoft
User-agent: toCrawl/UrlDispatcher
User-agent: The Intraformant
User-agent: TheNomad
User-agent: True_Robot/1.0
User-agent: True_Robot
User-agent: turingos
Disallow: /

## U
User-agent: URL Control
User-agent: URL_Spider_Pro
User-agent: URLy Warning
Disallow: /

## V
User-agent: VCI
User-agent: VCI WebViewer VCI WebViewer Win32
Disallow: /

## W
User-agent: Web Image Collector
User-agent: WebAuto
User-agent: WebBandit
User-agent: WebBandit/3.50
User-agent: WebCopier
User-agent: WebEnhancer
User-agent: WebmasterWorldForumBot
User-agent: Website Quester
User-agent: WebSauger
User-agent: WebStripper
User-agent: WebVac
User-agent: WebZip
User-agent: WebZip/4.0
User-agent: WebmasterWorld Extractor
User-agent: Webster Pro
User-agent: Wget/1.6
User-agent: Wget/1.5.3
User-agent: Wget
User-agent: WWW-Collector-E
Disallow: /

## Z
User-agent: Zeus
User-agent: Zeus Link Scout
User-agent: Zeus 32297 Webster Pro V2.9 Win32
Disallow: /

##
# END OF LIST ADAPTED (A LOT) FROM WEBMASTERWORLD
# 2005-09-17 18.00PM PDT
##########################
#
# DISALLOW COMPLETELY
# From websitetips 2005-09-21,
# more added from brainstorms
User-agent: Abilon
User-agent: aipbot
User-agent: Alter Ego
User-agent: arks
User-agent: Bilbo
User-agent: Bilbo/2.3b-UNIX
User-agent: boitho.com-dc
User-agent: DataFountains/DMOZ Downloader
User-agent: Digger
User-agent: Egress
User-agent: EverbeeCrawler
User-agent: Exabot-Images/1.0
User-agent: Exabot-Images
User-agent: FAST Enterprise Crawler
User-agent: Flashget
User-agent: Flashgot
User-agent: Gigabot
User-agent: Gigabot/2.0
User-agent: Grub
User-agent: Grub.org
User-agent: ht://Dig
User-agent: htDig
User-agent: Harvest
# User-agent: Hatena Antenna
User-agent: Hatena Antenna/0.4 (http://a.hatena.ne.jp/help)
User-agent: IconSurf
User-agent: JoBo Java Web Robot
User-agent: LinkWalker
User-agent: LocalcomBot
User-agent: Magpie
User-agent: Miva
User-agent: MJ12bot
User-agent: Motor
User-agent: MSProxy
User-agent: MSProxy/2.0
User-agent: MSRBot
User-agent: Onfolio
User-agent: Onfolio/2.0
User-agent: Onfolio/2.01
User-agent: Onfolio/2.02
User-agent: Pompos
User-agent: PrivacyFinder Cache Bot v1.0
User-agent: SBIder
User-agent: SeznamBot
User-agent: SeznamBot/1.0
User-agent: Simon
User-agent: Simon/2.1
User-agent: SpiderMan
User-agent: Voila
User-agent: vscooter
User-agent: Walhello appie
User-agent: winksite
User-agent: Yandex
User-agent: Yandex bot
User-agent: YandexSomething/1.0
Disallow: /

# MY ONGOING LIST OF ALLOWABLE BOTS
# ALLOWED BUT WITH EXCLUSIONS
User-agent: Googlebot-Image
Disallow: /*.cgi$
Disallow: /*.css$
Disallow: /*.gif$
Disallow: /*.jpg$
Disallow: /*.png$
Disallow: /0x938n39s
Disallow: /contact
Disallow: /cgi-bin
Disallow: /im
Disallow: /img
Disallow: /images
Disallow: /images1
Disallow: /js
Disallow: /js1
Disallow: /js2
Disallow: /mt
Disallow: /mtcss
Disallow: /mtcss2
Disallow: /secret

User-agent: Googlebot
Disallow: /*.cgi$
Disallow: /*.css$
Disallow: /*.gif$
Disallow: /*.jpg$
Disallow: /*.js$
Disallow: /*.pl$
Disallow: /*.png$
Disallow: /0x938n39s
Disallow: /contact
Disallow: /cgi-bin
Disallow: /im
Disallow: /img
Disallow: /images
Disallow: /images1
Disallow: /inc
Disallow: /incattack
Disallow: /includes
Disallow: /js
Disallow: /js1
Disallow: /js2
Disallow: /mt
Disallow: /mtcss
Disallow: /mtcss2
Disallow: /secret

User-agent: msnbot
Disallow: /*.cgi$
Disallow: /*.css$
Disallow: /*.gif$
Disallow: /*.jpg$
Disallow: /*.js$
Disallow: /*.pl$
Disallow: /*.png$
Disallow: /0x938n39s
Disallow: /contact
Disallow: /cgi-bin
Disallow: /im
Disallow: /img
Disallow: /images
Disallow: /images1
Disallow: /inc
Disallow: /incattack
Disallow: /includes
Disallow: /js
Disallow: /js1
Disallow: /js2
Disallow: /mt
Disallow: /mtcss
Disallow: /mtcss2
Disallow: /secret

# NOW THE OTHERS
User-agent: Baiduspider
User-agent: BigCliqueBOT
User-agent: BigCliqueBOT/1.03-dev (bigclicbot; http://www.bigclique.com; bot@bigclique.com)
User-agent: BruinBot
User-agent: BruinBot Crawler
User-agent: BSpider
User-agent: btbot
User-agent: Deepindex
User-agent: Digimarc MarcSpider
User-agent: Findexa
User-agent: Findexa Crawler
User-agent: FindoryBot
User-agent: FindoryBot/0.8 (findory.com)
User-agent: Girafabot
User-agent: gsa-crawler
User-agent: InfoSeek Robot 1.0
User-agent: Ingrid
User-agent: Jeeves
User-agent: StackRambler
User-agent: Swooglebot
User-agent: TITAN
User-agent: Teoma
User-agent: UCSD Crawl
User-agent: Yahoo-MMCrawler
User-agent: ZealBot
Disallow: /0x938n39s
Disallow: /contact
Disallow: /cgi-bin/
Disallow: /im
Disallow: /img
Disallow: /images
Disallow: /images1
Disallow: /inc
Disallow: /incattack
Disallow: /includes
Disallow: /js
Disallow: /js1
Disallow: /js2
Disallow: /mt
Disallow: /mtcss
Disallow: /mtcss2
Disallow: /secret
#
# END BRAINSTORMS ROBOTS.TXT
#