What causes the "urlopen error [Errno 13] Permission denied" error?

Problem description:

I am trying to write a Python (version 2.7.5) CGI script on a CentOS 7 server. My script attempts to download data from a LibriVox web page such as https://librivox.org/selections-from-battle-pieces-and-aspects-of-the-war-by-herman-melville/ and it bombs out with this error:

<class 'urllib2.URLError'>: <urlopen error [Errno 13] Permission denied> 
      args = (error(13, 'Permission denied'),) 
      errno = None 
      filename = None 
      message = '' 
      reason = error(13, 'Permission denied') 
      strerror = None

I have shut down iptables, and I can run things like `wget -O- https://librivox.org/selections-from-battle-pieces-and-aspects-of-the-war-by-herman-melville/` from the shell without error. Here is the bit of code where the error occurs:

def output_html ( url, appname, doobb ):
        print "url is %s<br>" % url
        soup = BeautifulSoup(urllib2.urlopen( url ).read())

Update: Thanks to Paul and alecxe, I have updated my code to look like this:

def output_html ( url, appname, doobb ):
        #hdr = {'User-Agent':'Mozilla/5.0'}
        #print "url is %s<br>" % url
        #req = url2lib2.Request(url, headers=hdr)
        # soup = BeautifulSoup(urllib2.urlopen( url ).read())
        headers = {'User-Agent':'Mozilla/5.0'}
        # headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.99 Safari/537.36'}
        response = requests.get( url, headers=headers)

        soup = BeautifulSoup(response.content)

... and I get a slightly different error when ...

response = requests.get( url, headers=headers)

... gets called ...

<class 'requests.exceptions.ConnectionError'>: ('Connection aborted.', error(13, 'Permission denied')) 
      args = (ProtocolError('Connection aborted.', error(13, 'Permission denied')),) 
      errno = None 
      filename = None 
      message = ProtocolError('Connection aborted.', error(13, 'Permission denied')) 
      request = <PreparedRequest [GET]> 
      response = None 
      strerror = None

... the funny thing is that I wrote a command-line version of this script and it works fine. It looks something like this ...

def output_html ( url ):
        soup = BeautifulSoup(urllib2.urlopen( url ).read())

Very strange, don't you think?

Update: This question was flagged with "This question may already have an answer here: urllib2.HTTPError: HTTP Error 403: Forbidden".

No, those answers do not address this question.

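Finally figured it out. SELinux was denying the script when it ran as a CGI under the web server, which is why the same code worked from the command line. Building a local policy module from the denials recorded in the audit log and installing it fixed the problem: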

# grep python /var/log/audit/audit.log | audit2allow -M mypol
# semodule -i mypol.pp
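
Before generating a policy module, it can help to confirm that the failure really is a local permission denial rather than something the remote server returned. The following is only a minimal sketch: the helper names `is_permission_denied`, `selinux_context`, and `explain` are hypothetical and do not come from the original script, and it assumes the exception has the same shape as the `urllib2.URLError` shown in the traceback above (a `reason` attribute wrapping a socket error with errno 13).

```python
import errno

try:  # Python 2, as used in the question
    from urllib2 import URLError
except ImportError:  # Python 3
    from urllib.error import URLError


def is_permission_denied(exc):
    """Return True if the exception wraps a socket-level EACCES (errno 13)."""
    reason = getattr(exc, "reason", None)
    return getattr(reason, "errno", None) == errno.EACCES


def selinux_context(path="/proc/self/attr/current"):
    """Return the security context of this process, or None if unavailable."""
    try:
        with open(path) as handle:
            return handle.read().strip("\x00\n ") or None
    except (IOError, OSError):
        return None


def explain(exc):
    """Build a short diagnostic string (hypothetical helper for illustration)."""
    if is_permission_denied(exc):
        return ("connect() was denied locally (errno 13); process context: %s; "
                "check /var/log/audit/audit.log for denials"
                % selinux_context())
    return "unrelated failure: %s" % exc
```

Wrapping the `urllib2.urlopen( url )` call in `try: ... except URLError as e: print explain(e)` inside the CGI script would then show the context the web server runs the script in. On a stock CentOS 7 setup that context is typically an `httpd` domain, while the same script started from a shell usually runs unconfined. Depending on the denial logged, enabling the `httpd_can_network_connect` SELinux boolean may be a narrower alternative to a custom module, but that is an assumption to verify against the audit log.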