# A TAG to ban all the web-browsers in .htaccess

I'm looking for a tag to ban (only in a specific directory) all
web-page browsers, with the exception of .xml pages (RSS feeds).

So, firstly, I need to understand whether .htaccess banning can act on
only a specific directory, and how.
Secondly, I need to ban all web-page browsers in order to stop crawlers.
To succeed in this I thought of a specific tag, but I would be grateful
for any better solution as well.
Thanks

## Re: A TAG to ban all the web-browsers in .htaccess

user45678xy wrote:

.htaccess rules can run on a single directory hierarchy, but I'm not
sure that's the right way to go.  If all you want is to ban google,
etc., look into robots.txt.  The well behaved crawlers will follow it;
the not-so-well-behaved ones will not, but then they can emulate another
browser anyway.

## Re: A TAG to ban all the web-browsers in .htaccess

Jerry Stuckle wrote:

Yes, I know this.

So the only solution is IP banning for the main harvesters?
And how can I tell Apache, by means of .htaccess, to apply a certain IP
ban only in a specific directory?
Thanks

## Re: A TAG to ban all the web-browsers in .htaccess

user45678xy wrote:

You miss the point.  The well-behaved ones will honor robots.txt.  The
not-well-behaved ones will emulate another browser, so your .htaccess
won't help anyway.

But if you insist on doing it the hard (and incorrect) way, try
alt.apache.configuration.  They have some experts on it there.

## Re: A TAG to ban all the web-browsers in .htaccess

Jerry Stuckle wrote:

I thank you very much.
Bye

## Re: A TAG to ban all the web-browsers in .htaccess

On Tue, 10 Feb 2009 08:01:52 -0500, Jerry Stuckle put finger to
keyboard and typed:

Eh? You can exclude by IP using .htaccess, and that's entirely
reliable unless the bot manages to spoof its IP address (or move to a
different one). I think you're misreading the OP's question and
assuming that he wants to deny by robots.txt.

Mark

## Re: A TAG to ban all the web-browsers in .htaccess

Mark Goodge wrote:

Please read his request again.  He wants to block by user agent, not IP.

## Re: A TAG to ban all the web-browsers in .htaccess

Mark Goodge wrote:

Also, trying to block misbehaved robots by ip address is worthless.
They'll just switch to another proxy.

## Re: A TAG to ban all the web-browsers in .htaccess

Jerry Stuckle wrote:

Yes, but I wanted to know whether there is any .htaccess tag to prevent
user-agents (and, if necessary, all browsers) from entering some specific
directories.

If you ban an IP, of course, this can be spoofed, but not all bots
act this way, and if it is not possible to stop every user-agent, it could
also be useful to prevent some IPs from accessing a certain directory, but I
don't know how to do it.

Anyway, I will try to ask this in the 'Apache' discussion groups, but
I probably won't always understand their answers, since it is a more
technical group. My web site is on an external server and I can't
configure Apache, only the .htaccess file.

## Re: A TAG to ban all the web-browsers in .htaccess

user45678xy wrote:

They're a pretty good crowd.  I'm sure some of their answers will be

## Re: A TAG to ban all the web-browsers in .htaccess

user45678xy wrote:

[...] agents that have "xml" will mean that any email scraper or bot can loop
through any number of agent values until they have access and then all [...]

The only way to do this is to deny all and then
allow the IP or IP range of requesters you trust (you can then add the
xml requirement on top of that for the agent field), and you should
probably be okay.  Banning every bot IP after it's done the damage is a
pointless task, and the agent field alone isn't strong enough if you
really care.  If it's not a big deal, then try one first and see how
well it works.

## Re: A TAG to ban all the web-browsers in .htaccess

Tim Greer wrote:

files other than .xml (not XML in the agent field). Is that right?  If
so, simply deny access to all files other than .xml files (maybe unless
they are from a specific IP or IP range).
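
A minimal sketch of that idea, assuming the host permits access-control directives in .htaccess (`AllowOverride Limit`) and using the Apache 2.2-style `Order`/`Allow`/`Deny` directives; the IP range is a documentation placeholder, not a real requester:

```apache
# .htaccess placed inside the directory to protect.
# Default for everything in this directory: deny, except a trusted range.
Order Deny,Allow
Deny from all
Allow from 192.0.2.0/24        # hypothetical trusted range

# Files ending in .xml (the RSS feeds) stay open to everyone.
<FilesMatch "\.xml$">
    Order Allow,Deny
    Allow from all
</FilesMatch>
```

The `<FilesMatch>` section is applied after the directory-level rules, so it overrides the blanket deny for .xml files only. On Apache 2.4 these directives are superseded by `Require`, so the syntax would need adapting there.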

## Re: A TAG to ban all the web-browsers in .htaccess

Tim Greer wrote:

I'm sorry, there was an error. I didn't explain this because
previously it was not so important, but really my purpose was to stop
some operators on the basis of their user agents, especially in their
access to .xml files, which are placed, on my web site, in a specific
directory.

So far, as I have understood it, I can prevent some user agents from
accessing a specific directory by listing all their IPs (I know this could
also work by grouping all the IPs of a certain operator -> deny from
xxx.xxx), but as you were explaining it is a little complex, and I don't
know whether it would also cause a considerable slow-down of the server.

So what could still be useful, in my mind, is to prevent access on the
basis of something more specific, like the user-agent tags used
in the robots.txt file, which are not respected by all bots.

Acting in the way you were telling me, I think it could be useful to list
allowed user-agents instead of their IP ranges (this really is not what
interests me, because I need to allow common users to access RSS and
not the bots).

## Re: A TAG to ban all the web-browsers in .htaccess

user45678xy wrote:

I'm sorry, I wanted to say this:

Acting in the way you were telling me (allowing instead of denying), I
think it could be MORE useful if I could list allowed user-agents, instead
of their IP ranges.
But this, really, is not what interests me, because what I need is to
allow common users to access RSS and deny the bots.
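
One possible way to express "deny certain user agents, but only for the .xml files" without listing IPs is mod_setenvif combined with `Deny from env=`. This is only a sketch: the agent strings and the `blocked_agent` variable name are invented for illustration, and it assumes mod_setenvif is loaded and that the host allows these directives in .htaccess.

```apache
# Tag requests whose User-Agent header matches a listed pattern
# (case-insensitive). The patterns below are hypothetical examples.
SetEnvIfNoCase User-Agent "ExampleHarvester" blocked_agent
SetEnvIfNoCase User-Agent "^AnotherBot"      blocked_agent

# Apply the ban only to the feed files in this directory.
<FilesMatch "\.xml$">
    Order Allow,Deny
    Allow from all
    Deny from env=blocked_agent
</FilesMatch>
```

Ordinary feed readers and browsers are unaffected; only requests that announce one of the listed agents get a 403. As already noted in the thread, a bot that sends a different User-Agent string will pass straight through.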

## Re: A TAG to ban all the web-browsers in .htaccess

On 2009-02-10, user45678xy wrote:

If the server is Apache with the mod_rewrite module, you can do
something like this in a .htaccess file:

RewriteEngine On
RewriteCond  %{HTTP_USER_AGENT}  ^Pingdom               [NC,OR]
RewriteCond  %{HTTP_USER_AGENT}  bdbrandprotect         [NC]
RewriteRule  ^.*$  -  [F,L]

That stanza will return a 403 ("F" for forbidden) to any request with a
user-agent that matches ("NC" for not case-sensitive) either of those
patterns.  You can make it much more complicated; here are some links
that go into more detail.

http://corz.org/serv/tricks/htaccess2.php
http://www.javascriptkit.com/howto/htaccess.shtml

HTH.

--
Nobody ever went broke underestimating the taste of the American
public.  [Mencken]

## Re: A TAG to ban all the web-browsers in .htaccess

Adam Funk wrote:

I thank you very much!
...it is a little complex, but these are very interesting sites for
learning more about .htaccess.

## Re: A TAG to ban all the web-browsers in .htaccess

@news.ducksburg.com:

You may also take Adam's example and add multiple conditions.
E.g., either of the following UAs and a specific IP range:

RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^Pingdom [NC,OR]
RewriteCond %{HTTP_USER_AGENT} bdbrandprotect [NC]
RewriteCond %{REMOTE_ADDR} ^123\.456\.789\. [OR]
RewriteCond %{REMOTE_ADDR} ^234\.456\.789\.1([0-9][0-9])$
RewriteRule .* - [F]

Htaccess may be quite effective if one takes the time to comprehend
"regex". At the same time, a single syntax error may result in a 500
error, preventing visitor access to your entire site.

Some years back, the general use of htaccess by many webmasters was in
"stated" denials, which is an ongoing process.
Today the general trend is in "whitelisting", letting in those that you
desire (SE's, visitors, refers, IP's, Browsers and more) and denying
everybody else.
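
A whitelisting setup along these lines might look like the following sketch. The feed-reader pattern, the `allowed_client` variable name, and the IP range are all made up for illustration; it assumes mod_setenvif and the Apache 2.2-style access directives are available in .htaccess.

```apache
# Mark the clients you want to let in (hypothetical User-Agent pattern).
SetEnvIfNoCase User-Agent "^ExampleFeedReader" allowed_client

# Deny everybody, then re-admit only the whitelisted clients.
Order Deny,Allow
Deny from all
Allow from env=allowed_client
Allow from 192.0.2.0/24        # hypothetical trusted IP range
```

With `Order Deny,Allow` the `Allow` lines are evaluated last, so a request is admitted if it matches either the agent pattern or the IP range, and refused otherwise.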

Harvesters use and learn the same things as webmasters, so many of the
methods used are not discussed in open forums (why help the harvesters
understand an effective method?).

## Re: A TAG to ban all the web-browsers in .htaccess

On 2009-02-14, Don wrote:

That's an important warning, which I forgot to mention.  Thanks for
bringing that up.  (Yes, I have made this mistake.)

It's also worth pointing out that not all Apache installations provide
mod_rewrite, although I think most do.
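
If you are unsure whether the host provides mod_rewrite, one hedged option is to wrap the rules in an `<IfModule>` block, so that a missing module causes the rules to be skipped instead of producing a 500 error. The sketch below reuses the two patterns from earlier in the thread and assumes the conditions are meant to test the `%{HTTP_USER_AGENT}` variable.

```apache
# Only processed when mod_rewrite is actually loaded.
<IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteCond %{HTTP_USER_AGENT} ^Pingdom        [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} bdbrandprotect  [NC]
    RewriteRule .* - [F]
</IfModule>
```

The trade-off is that on a server without mod_rewrite the block silently does nothing, so the agents are not blocked at all; it protects the site from the 500 error, not from the bots.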

## Re: A TAG to ban all the web-browsers in .htaccess

On Tue, 10 Feb 2009 13:11:37 +0100, user45678xy put finger to keyboard
and typed:

You can use user-agent banning or referer banning (a lot of bots have
distinct patterns you can use in that respect). But IP banning is the
simplest and most reliable.

<Directory /this_directory>
Order Allow,Deny
Allow from all
Deny from 123.123.123.123
</Directory>

where 'this_directory' is the one you want to restrict access to, and
the IP address is whatever the bot's address is. You can have multiple
Deny lines in .htaccess, one for each bot, if necessary.

See http://httpd.apache.org/docs/1.3/mod/mod_access.html for more
details (that documentation is for Apache 1.3, but the syntax is the
same for 2.0).
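
Note that `<Directory>` containers belong in the main server or virtual-host configuration and are not permitted inside a .htaccess file. For someone who can only edit .htaccess, a sketch of the equivalent is to put the file inside the directory itself and drop the wrapper. The second address range is a placeholder added for illustration, and the host must allow these directives (`AllowOverride Limit`).

```apache
# Saved as /this_directory/.htaccess; it applies to this directory
# and its subdirectories, so no <Directory> wrapper is needed.
Order Allow,Deny
Allow from all
Deny from 123.123.123.123
Deny from 192.0.2.0/24         # hypothetical: a whole range in CIDR form
```

With `Order Allow,Deny` the `Deny` lines are evaluated last, so the listed addresses are refused while everyone else is admitted.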

Mark

## Re: A TAG to ban all the web-browsers in .htaccess

Mark Goodge wrote:

I know this; what I was trying to understand was whether there is a
common tag to stop all browsers except for '.xml' pages.

This is the other thing I was trying to find, so that I can prevent only
some directories from being visited.

This is very interesting too.

Thank you very much for your help, Mark,
Bye