ERROR: Could not open CONNECT tunnel

Reposted by an anonymized user, 2022-4-20 22:05

Landon Campbell

11 months ago

Hi,

Pretty new to Scrapy, so forgive me if this is obvious. We're running
Scrapy 0.24.2 (under Portia/Slybot), with ProxyMiddleware enabled and a
fairly large pool of proxies. Any time I request an HTTPS URL, I receive a
"Could not open CONNECT tunnel" error, which ultimately causes the spider
to close. In my development environment, I'm running Scrapy 0.24.4
(Portia/Slybot), through the same proxies, and I do NOT have this problem.
Is this simply a Scrapy version issue, or is it something else? Can't
figure out why it's OK one place but not the other. Any thoughts would be
appreciated.

Thanks,
Landon

--
You received this message because you are subscribed to the Google Groups "scrapy-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scrapy-users+***@googlegroups.com.
To post to this group, send email to scrapy-***@googlegroups.com.
Visit this group at http://groups.google.com/group/scrapy-users.
For more options, visit https://groups.google.com/d/optout.

Travis Leleu

11 months ago

Why don't you upgrade to 0.24.4 on your production environment?

Landon Campbell

11 months ago

Upgrading is an option, but I prefer to know *why* something is happening.
If this is a known issue that's been fixed, great. Otherwise, if anybody
has an explanation, that would be appreciated.

Daniel Fockler

11 months ago

I've generally seen this error on sites that are using SSL. I'm not sure
about the specifics, but it's because the SSL handler in Scrapy can't
manage the connection with whatever site you are working with.

Travis Leleu

11 months ago

Are you running through a proxy? IIRC, there is some funkiness when trying
to connect via https when your proxy is an http-only proxy.

I use Crawlera, which has an alternative endpoint (you connect via http to
Crawlera, pass the encoded https url, and the proxy connects via https to
the target server). You may need to configure to do http to your proxy,
https from your proxy to the target server.
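
For context, an HTTPS request through an HTTP proxy normally starts with a plain-text CONNECT request, and the TLS handshake only begins once the proxy answers with a 2xx status. The sketch below is not Scrapy's code. It only illustrates that exchange using plain Python, and the helper names (`build_connect_request`, `tunnel_established`) are hypothetical, chosen for this example.

```python
def build_connect_request(host, port=443):
    """Return the bytes a client sends to an HTTP proxy to ask for a tunnel."""
    target = "{}:{}".format(host, port)
    lines = [
        "CONNECT {} HTTP/1.1".format(target),
        "Host: {}".format(target),
        "",  # blank line terminates the request head
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def tunnel_established(response_head):
    """Return True if the proxy's reply to CONNECT carries a 2xx status code."""
    status_line = response_head.split(b"\r\n", 1)[0]
    parts = status_line.split(None, 2)
    # A well-formed reply looks like: b"HTTP/1.1 200 Connection established"
    if len(parts) < 2 or not parts[0].startswith(b"HTTP/"):
        return False
    return parts[1].isdigit() and 200 <= int(parts[1]) < 300
```

A proxy that refuses CONNECT (for example with a 403 or 407 reply) makes `tunnel_established` return False, which is the general situation a "could not open tunnel" style error describes.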

Without more specifics of your situation, I'm afraid that's all the help I
can give. You might try and make sure all your SSL type libraries are
up-to-date, as I've run into errors when out of date libs prevent the SSL
handshake, borking everything.
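
One way to compare the two environments is to print the versions of the packages involved in HTTPS on each machine and diff the output. This is a minimal sketch assuming Python 3.8+ (for `importlib.metadata`). The package list is an assumption about what is relevant (Scrapy relies on Twisted and pyOpenSSL for TLS), and `environment_report` is a hypothetical helper name. On older interpreters, `pip freeze` gives similar information.

```python
import ssl
import sys
from importlib.metadata import PackageNotFoundError, version

# Assumed list of distributions worth comparing between environments.
PACKAGES = ("Scrapy", "Twisted", "pyOpenSSL", "cryptography")


def environment_report(packages=PACKAGES):
    """Collect interpreter, OpenSSL, and package versions into a dict."""
    report = {
        "python": sys.version.split()[0],
        # OpenSSL linked into the stdlib ssl module; pyOpenSSL may be
        # linked against a different OpenSSL build.
        "openssl (stdlib ssl)": ssl.OPENSSL_VERSION,
    }
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for name, value in environment_report().items():
        print("{}: {}".format(name, value))
```

Running this on both the production and the development machine and comparing the two outputs should show whether the difference goes beyond the Scrapy version itself.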

Landon Campbell

11 months ago

Travis,

Yes, we are using proxies, about 100 of them, but I don't *think* that's
the issue, as I'm able to crawl these sites successfully using those
proxies from my local Ubuntu. I think your point regarding SSL type
libraries is promising, but being new to Python, I'm not sure which
libraries those would be. Do you have any suggestions for which libraries I
might investigate?

Thanks,
Landon

Reposted from: https://my.oschina.net/airship/blog/628812
