Setting a Proxy for the Google API Client

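My project needs to scrape the comments on YouTube videos through the YouTube Data API v3. Google's sample Python code uses the google-api-python-client package, but its documentation does not explain how to enable a proxy. After a morning of trial and error, I finally got API access through a proxy working once I read the source code.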

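Failed attempts

Calling the API with requests

Since the client did not appear to support proxies, I figured I would drop the package and build the API requests myself with requests. Unfortunately, every attempt ended in either an SSL error or a 403 response.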
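For reference, a raw call of this kind might look like the sketch below. The post does not show what was actually tried, so everything here is an assumption: the commentThreads endpoint and its part/videoId/key parameters, the local proxy address, and the helper names build_request_kwargs and fetch_comment_threads, which are made up for illustration. It is not presented as a fix for the SSL or 403 errors.

```python
# Endpoint of the YouTube Data API v3 method that lists comment threads.
YOUTUBE_COMMENT_THREADS_URL = "https://www.googleapis.com/youtube/v3/commentThreads"


def build_request_kwargs(video_id, api_key, proxy_url):
    """Collect the keyword arguments for requests.get() (hypothetical helper)."""
    return {
        "params": {"part": "snippet", "videoId": video_id, "key": api_key},
        # requests picks the proxy by the scheme of the target URL.
        "proxies": {"http": proxy_url, "https": proxy_url},
        "timeout": 10,
    }


def fetch_comment_threads(video_id, api_key, proxy_url="http://127.0.0.1:10809"):
    """Send the request through the proxy and return the decoded JSON body."""
    # Imported here so the pure helper above works without requests installed.
    import requests

    response = requests.get(
        YOUTUBE_COMMENT_THREADS_URL,
        **build_request_kwargs(video_id, api_key, proxy_url),
    )
    response.raise_for_status()  # turns a 403 into an exception
    return response.json()
```

Keeping the argument building separate from the network call makes it easy to print and inspect exactly which parameters and proxy would be used before anything is sent.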

Setting a global socket proxy

issue #569

Hello, you know the google APIs are blocked in China, so we can only access these APIs by proxy, but I don’t know whether the python client support proxy setup? If yes, please tell me how. If no, could you please add the proxy feature? Thanks!

However, I can set up a global proxy manually by this way, but it a global proxy, other requests would go through with the proxy, which is unnecessary.

import socket
from httplib2 import socks
import google_auth_oauthlib.flow

# Socks5 proxy
socket.socket = socks.socksocket
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 1086)

flow = google_auth_oauthlib.flow.Flow.from_client_secrets_file(
    CLIENT_SECRETS_FILE, scopes=SCOPES)
flow.authorization_url(
    # Enable offline access so that you can refresh an access token without
    # re-prompting the user for permission. Recommended for web server apps.
    access_type='offline',
    # Enable incremental authorization. Recommended as a best practice.
    include_granted_scopes='true')

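With this approach my socket connections timed out. That probably has something to do with my proxy server, which I cannot change, and I do not know much about SOCKS5 anyway.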
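The issue's complaint that the patch is global can be softened a little by undoing it as soon as the proxied calls are done. The sketch below is one way to do that with a context manager; temporary_socket_class is a hypothetical helper name, not part of httplib2 or the Google client, and it does nothing about the timeout described above.

```python
import contextlib
import socket


@contextlib.contextmanager
def temporary_socket_class(socket_cls):
    """Swap socket.socket for socket_cls inside the with-block, then restore it."""
    original = socket.socket
    socket.socket = socket_cls
    try:
        yield
    finally:
        # Always restore the original class, even if the wrapped code raises.
        socket.socket = original
```

Usage would be along the lines of `with temporary_socket_class(socks.socksocket): ...` around the calls that need the proxy. The swap is still process-wide while the block runs, so it is not safe if other threads open sockets at the same time.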

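The approach that worked

Since none of that worked, I went to read the source. To send a request, the library has to call into something like urllib3 at some point.

That led me to googleapiclient.discovery.build():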

def build(
    serviceName,
    version,
    http=None,
    discoveryServiceUrl=None,
    developerKey=None,
    model=None,
    requestBuilder=HttpRequest,
    credentials=None,
    cache_discovery=True,
    cache=None,
    client_options=None,
    adc_cert_path=None,
    adc_key_path=None,
    num_retries=1,
    static_discovery=None,
    always_use_jwt_access=False,
):
    """Construct a Resource for interacting with an API.

    Construct a Resource object for interacting with an API. The serviceName and
    version are the names from the Discovery service.

    Args:
      serviceName: string, name of the service.
      version: string, the version of the service.
      http: httplib2.Http, An instance of httplib2.Http or something that acts
        like it that HTTP requests will be made through.

From here it is obvious: the parameter to look at is http.

http: httplib2.Http, An instance of httplib2.Http or something that acts like it that HTTP requests will be made through.

The fix follows directly: pass in an httplib2.Http instance that has the proxy settings configured.

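import googleapiclient.discovery
import httplib2

# HTTP proxy on localhost; change the host and port to match your own proxy.
proxy_info = httplib2.ProxyInfo(proxy_type=httplib2.socks.PROXY_TYPE_HTTP, proxy_host="127.0.0.1", proxy_port=10809)
http = httplib2.Http(timeout=10, proxy_info=proxy_info, disable_ssl_certificate_validation=False)

# api_service_name, api_version and DEVELOPER_KEY must be defined beforehand (not shown here).
youtube = googleapiclient.discovery.build(
    api_service_name, api_version, developerKey=DEVELOPER_KEY, http=http)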
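To round this off, here is a rough sketch of how the proxied client could be wrapped up and used to pull comments, which was the original goal. The helper names (proxy_settings_from_url, build_youtube_client, fetch_top_level_comments), the proxy URL, and the choice of the commentThreads().list() call with part="snippet" are my assumptions for illustration; the post itself stops at build().

```python
from urllib.parse import urlsplit


def proxy_settings_from_url(proxy_url):
    """Turn 'http://host:port' into host/port keyword arguments for ProxyInfo."""
    parts = urlsplit(proxy_url)
    if parts.scheme != "http" or not parts.hostname or parts.port is None:
        raise ValueError(f"expected an http://host:port proxy URL, got {proxy_url!r}")
    return {"proxy_host": parts.hostname, "proxy_port": parts.port}


def build_youtube_client(developer_key, proxy_url="http://127.0.0.1:10809", timeout=10):
    """Build a YouTube Data API v3 client whose requests go through an HTTP proxy."""
    # Third-party packages, imported here so the pure helpers work without them.
    import googleapiclient.discovery
    import httplib2

    proxy_info = httplib2.ProxyInfo(
        proxy_type=httplib2.socks.PROXY_TYPE_HTTP,
        **proxy_settings_from_url(proxy_url),
    )
    http = httplib2.Http(timeout=timeout, proxy_info=proxy_info)
    return googleapiclient.discovery.build(
        "youtube", "v3", developerKey=developer_key, http=http
    )


def fetch_top_level_comments(youtube, video_id, max_results=20):
    """Return the text of the top-level comments on one page of results."""
    response = (
        youtube.commentThreads()
        .list(
            part="snippet",
            videoId=video_id,
            maxResults=max_results,
            textFormat="plainText",
        )
        .execute()
    )
    return [
        item["snippet"]["topLevelComment"]["snippet"]["textDisplay"]
        for item in response.get("items", [])
    ]
```

With a valid key, `fetch_top_level_comments(build_youtube_client(DEVELOPER_KEY), video_id)` would return the first page of comments. Only this client is proxied, so other traffic in the same process is left alone, which was the drawback of the global socket patch.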