Scraping the Subject 4 Driving Test Question Bank with Python

1. Environment

PyCharm

Python 3.6
Dependencies installed with pip: requests 2.25.0, urllib3 1.26.2, docx 0.2.4, python-docx 0.8.10, lxml 4.6.2

Google Chrome

2. Target site and request analysis

Driver's license exam website

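The site shows that Subject 4 has 1487 questions in total. To collect them all into a single Word document, we need the text and the image of every question.
        First, open the site in Chrome, press F12, and click Network. Then click the right arrow in the question panel on the left to move to the next question, and keep clicking so that the page issues one request after another. In the panel on the right, the request URLs of the questions differ only in a five-character exam code, so we need a way to obtain the exam code of every question.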
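As a minimal sketch of that observation, the helper below builds the request URL of one question from its exam code. The URL pattern is the one used by the full script in section 3; the function name `question_url` and the check that a code is five hexadecimal characters are illustrative assumptions, not part of the site's API.

```python
import re

BASE_URL = 'https://tkdata.mnks.cn/ExamData/'
# Query suffix copied from the request URL used by the script in section 3.
SUFFIX = '.json?CALL=?20201231143735.json'


def question_url(exam_code):
    """Return the JSON URL of one question (hypothetical helper)."""
    # Assumption: every code is five lowercase hexadecimal characters,
    # which is how the codes in the ExamCodes list look.
    if not re.fullmatch(r'[0-9a-f]{5}', exam_code):
        raise ValueError('unexpected exam code: ' + repr(exam_code))
    return BASE_URL + exam_code + SUFFIX


print(question_url('d704f'))
```

Validating the code before building the URL is a design choice of this sketch: a malformed code fails early instead of producing a request that the server rejects.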

Click one of the requests: the Headers tab shows the request URL and status of the question, and the Preview tab shows the question's data.

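Next, with the site still open in Chrome, press F12 and click Sources. In the Page pane on the left, expand tk.mnks.cn, then lianxitix, and open the folder beneath it. The file there defines the Exam variables; ExamCodes holds the exam codes of all 1487 questions.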

var ExamStatus = "200";
var ExamMsg = "";
var ExamVersion = "20210205090935";
var ExamCount = 1487;
var ExamCodes = "d704f,bdb25,2d1d4,a9eb5,b0671,9b18f,2b0c4,8f1ec,07f6d,ed8f7,a74a2,46806,faa42,a580e,4fc7a,0fb21,97686,37b6b,c9263,f1b18,79e07,e9c0b,01e28,d345f,fcb88,f1301,75642,2a3b2,f70b6,2993b,b493e,64a3d,c9fdd,21f14,53c76,ddd2d,7f5d0,f8543,ee954,533ba,dcd4a,2c15b,8ac1a,e97d7,10e03,0202c,b109d,ebf95,0f7f1,0bca9,36b59,9eefd,eb795,b2a2b,704ea,856e9,9e92c,e76a1,e3970,c36f5,5a407,ee9d9,aeb78,0a570,bdc91,f8f59,c08f0,bb45b,515d9,b347d,28b17,b3dab,bbbbd,94c53,54a92,f1e50,09587,c7890,7a9b3,354b9,e2803,78c8c,256e7,3a8cc,43137,883e8,4f318,b7641,f3192,5a496,1c1fa,7e43d,a50ed,87ad2,9ea88,c788b,95c60,ca436,007c7,a7057,cab0a,5afed,e32ce,7a961,157f8,fc304,12c14,179bc,82e35,863ac,bd2f8,5655d,62935,ff906,8eda8,bad13,8714d,af074,303f7,d4bf5,c78bc,97940,7a981,c87e2,bb685,eb46c,b14b5,7c563,7fc96,e5540,7134a,b85c7,49e6f,9d464,aba3d,60a01,e11cd,bc16b,8ff78,cd884,b2efe,79e6b,dd188,969c7,bc7d9,a856f,c8f91,bb731,45056,658d9,ba732,ebf44,5e748,33f22,58c97,4f211,94644,ee187,881a9,97ecd,6017d,c0b7e,1ad72,5c82d,8146e,3b63a,32c78,c5ac6,9a9ad,9c436,3ce2d,f7e51,b1371,28e07,e39fb,a74c9,2d870,73ccd,e5cfc,04f25,77e6e,9c259,59e10,9f83a,a8d5b,40245,6b3b1,473e8,a215a,b6313,2578c,27457,ca78c,140ef,dbab9,67aac,be529,9fa10,c158f,3e964,5d883,c996e,1e2bb,fc9c7,c096c,07512,683c0,ac2bf,981ea,54456,5f324,90575,db76b,817db,60ea8,8502e,b30e3,90414,a44b7,87ad5,d6e51,6112b,3f620,d23ee,c3f1f,5c057,aafad,63844,0b51d,51231,6679a,bc264,e1353,97cd8,50512,56ecd,610f0,205d7,ed246,c0b90,c9b5c,6d806,48695,961e8,dec7c,71b2c,6a3c6,02b1d,8635f,b8bc4,73790,0a39e,128be,b6b28,88e4f,542df,60ef6,321d8,9be31,48d07,c007a,aea99,837e2,16d82,b260a,21389,906cb,59cd0,d8d85,2da51,674bb,14ef3,71dca,0634a,893a8,0e0ec,e7ef9,15d22,80b13,436f7,6e7e9,60428,a04f4,efbe5,2e063,6aae7,614f2,a4e14,ab335,0e9e1,f1b88,a30a3,8e766,b2b36,b32a7,208cd,8d6c5,ca538,35b1c,c128c,88bc8,c123a,0ec98,83b56,6641b,b279a,242e0,a14a9,0bed0,9795a,d802e,ab41c,db4a1,e0842,05f62,44840,e79b5,91a1e,69132,bf47f,80796,3cf91,779b3,52c5e,44de9,dcd2f,ceee6,993c9,c4552,2b757,3a2
4b,0766b,4d6fa,d9703,bda35,3cb4e,eeef6,3cd13,9103a,8fcad,c6424,76bcc,6746e,8022f,ad831,5fef8,53998,abfa9,93c85,f2b94,a34ce,c831f,14088,81cff,c2719,23fe1,e0774,4d34d,e2912,c516b,8d197,8bc73,9269f,3cbdc,05320,e7e31,78e16,b61d6,a8bec,cce92,f68ba,a0252,d058b,f8dde,e1e40,b733e,b812d,aea74,e417f,beca3,f7c37,bff53,a0c66,c87c0,f0175,e996e,f1b8f,b20eb,c5b0f,e72f8,d57bf,fffcc,de5e6,ef8b1,b20f2,e3834,cb709,fc06f,a4af2,e376f,d79a1,acd5a,ba00b,d5399,f8091,c218a,d67d2,a740f,d324d,fa535,a1ffa,bcbed,e7755,cc3f0,aaef0,b6c1d,a3f35,ebed2,f8175,bebd6,cdf38,f8679,ccf4a,f1c28,bf561,b1477,babb5,ac9f8,f2f75,fbc74,c0f8d,c72e1,bc708,d9ac6,ea6fa,da81e,df72f,faa0e,addb6,c8ac1,a8a27,f3548,f634b,ec2cf,a4d17,e68d1,eff90,d51b1,bdfee,e0dc2,e421a,b8a92,e1b62,d01d0,cc671,c153a,fc9cf,df350,d2c22,df916,c5d21,f765e,aa2d4,a9d8d,c2b36,ca016,b4e29,ac2fa,e4599,b83cd,f586d,ef7bf,b0c07,fc6d2,eeec7,e720e,b558e,ef56f,b3353,a6de8,fab1d,c921e,b6a29,a0019,b03ef,cc742,a3d8d,b4e35,e6668,fe71a,fd836,cded9,c458e,d454b,b2f57,deda8,ab459,f47b9,f0356,d2f78,e6b0c,ac32f,f78d5,d1c4e,da652,b391c,a2fc6,a77f4,de655,f8baf,b66e9,c8b74,d1dfa,ee79c,c3612,ec3f1,ea5c6,d0a61,c08cf,d7f50,f2685,ab910,f173e,d1499,ebcdd,f762d,e97b3,a47eb,e32c8,db61e,c8dca,fcfdd,e0051,d40d5,cfec0,f7cf1,af433,e4017,bc343,d5ba9,d60fe,bb346,a1b8c,ab5ed,b65e5,ec1cf,a7798,e6737,ea3b8,e1076,ce973,a5c75,ce15f,d70aa,dbc04,dce8a,b5565,a68f0,abc90,c1792,eddfa,ab24c,ac20a,a732d,e49d4,e7fb1,fface,e48cf,fa808,cc8b2,fe112,bcf86,c0e70,e085b,bc95e,fc890,cf8b5,cac27,c4e9c,ce788,ca5c8,b554b,d68a4,db427,b20b7,be61d,b0fc3,b18a2,c7946,ea80b,abb99,b9529,ddd67,b9d3e,b8b96,fc5da,ec6c4,d5cea,e1a86,f55b7,b4a50,cf3cb,e270a,d15ab,f7603,b0b77,b3cfa,c73e0,519d9,6d9af,ff6ac,0e915,9ac33,d927d,b47e4,f63f1,4e160,c5f3b,616e0,6cd0d,89473,f4f21,9bcb3,2867b,807a8,130fb,12c69,017d6,6aeaa,fe06a,59fbd,4a0ff,d27ca,eb735,819ab,4f065,f4a64,a6615,95ff2,e5b71,2c989,8b864,67974,8109c,a376e,9e031,80cf3,e7b13,084fd,ff06b,2e73a,50e6b,238d6,6e76a,859eb,467bf,21e1a,961a7,bd190,86f9b,93a64,d72d6,b12b0,ca724
,46d09,b8699,78c25,a7f66,ffcd7,7b165,04636,5841a,c1cc5,290e8,c97bc,88155,a4330,8619d,9bc46,dbb4a,81427,1fe18,69abd,4639e,31152,3a172,e6d89,65b42,08412,d087e,c6cc0,85d52,ba77a,0afcd,33a5e,b60ef,ca7d7,8e6ac,71375,61b21,eb252,0ad2c,2dd2f,5a7b7,c0fdf,cf98b,38e59,e8b7c,118f9,fd67f,2c3bb,d4024,c4a96,09f62,22173,f308d,df4cc,ce9b0,90577,397c5,fe055,19b64,efa09,10e0c,53c4b,14258,9bbfc,6bb19,0ed7e,38f11,4fa06,e5dd8,cf457,52ea4,38e5c,4193a,9b59f,26352,71e5e,d923d,c204e,7170f,d5470,c16c1,a5b36,51fe7,80b37,44ac6,37b37,1243e,d3937,3e01b,65c9d,3fea6,2a2c3,1cb7a,9821c,7971e,1163c,49c8f,646d7,54584,32bfa,b9ed8,8b4b7,7b211,4b622,69a0d,75b11,45352,44bf2,f7294,f1353,284e5,c00dd,57854,44d40,128d2,2205b,94278,e9c3b,6f533,b6b90,58638,4f718,0f36c,1ac30,34b2f,72e3c,2b14a,574e6,41689,66c69,79bf5,7e87f,8bb46,d1d95,b3c79,0b2b6,95c8c,321df,61c35,d4e94,16c87,e4408,8bd10,003c0,5a1d3,42799,edac5,005e8,cb158,f6bf8,547de,1840d,a2f3b,3d25a,965f0,5bd1b,cf0d7,40922,a579e,aa5be,d9a9d,bc86a,fd8c1,d84cc,e066a,a8c25,edd5e,36d33,2c773,fecd1,f52c9,e32ed,b3cc3,8aaae,1dccc,aa0f0,a73f1,7d027,68418,73131,8e24f,8a579,e6fb8,f1bbf,801c6,d4eb6,e79a0,85d3d,077c8,e001c,45cf4,8362b,c00e3,047f6,e5678,4d14b,c045d,1f4a4,a99b0,1e938,d3633,b84e8,f3c29,2fc4a,195c0,6000b,49b05,0203b,6effd,9a31a,3ba9a,eaa09,f0573,6d220,2fa6c,f9acd,c1cb6,43b6e,fa908,ba1cf,dd3c4,64fdd,d7ddf,e6f82,b4c0a,79d4f,df8c9,9c0a2,b121c,aa5c5,ac2b2,a5fa8,ee70b,56c48,23082,f6de7,d5275,e97d9,4ee52,c8971,c8eb5,927a3,33416,4507d,b5113,e7f9d,853ef,97b3c,3cfb2,fecd3,aa1d3,178e3,10b7d,c8c06,f33bb,e046d,0c159,abf33,0a7bd,a990d,1e31d,daf75,49268,f0e5e,6b4f8,bd4f8,684b6,b2701,9c972,69103,17d69,2ea7b,ce1c0,30081,02465,859de,6b269,307f5,d2870,8ffaf,222a3,183b4,a9bfb,d693a,bc73e,9b2eb,b7749,abf5d,abeaa,8de3b,a1a23,08fb6,52c84,23d21,2ae95,636ee,7cd61,98a4f,f12bb,f5aa6,2566e,b107b,b4fa8,3bff4,a174e,322cf,1eea3,f80c5,a0e5a,34b1f,afa49,29305,9585c,a236a,82381,d7a96,41172,2a1e0,6ce2a,2f303,49452,5f567,e7b8a,edddd,5ec6d,e8369,bea53,7b903,212b0,35f9b,0cc9a,24b24,5cfe9,b78ae,c
e597,8bfb9,ac70b,da0b2,e4568,e6786,d5da8,abcda,d85eb,b0fd9,ed9a4,f0b48,bce0a,fdf29,fbefb,ab641,fb21f,d770c,ec608,d6e33,f73e8,c661f,efc46,ba9d4,f911c,ed8bc,ef63d,be67e,c9c90,cfa3e,2a692,88fbb,88658,c0bd1,568f4,2cca3,3cd85,91402,eca62,d4863,70705,d47f6,3b054,29a98,1e3c8,ccc22,fcaa3,43b54,5eaf6,5c85d,dfcfa,6497c,da3cc,bbd1c,62646,212f6,49141,b6396,6bfe9,14b90,47364,8798b,0eec5,83039,8b0b5,f43f7,b2dea,a8723,4e5bd,c741c,33380,43429,503c6,05fcb,70446,35d8c,4d9c9,25e1c,c5ad4,a2372,07ba3,53515,2ddaa,549c1,c92ea,39a17,ea5ef,f7ca4,e1817,86aea,ca38a,243cf,2a4ee,42e8d,09f91,3f35d,11e0b,072c6,73b62,d4333,5b9c6,0b507,f027c,5aea2,e78cd,602cd,c723b,54fa6,9d161,11fbd,fd2e1,6425d,fb2b7,0ea23,368fa,78d15,95162,19158,3bda1,e1861,8f8eb,5cfa8,786db,60697,2048c,c10b7,99425,7e170,44af8,0fa2e,783ba,83705,5c305,85b33,f9b21,4c0b1,f5ac0,5fddb,58158,8c539,252fe,871b9,24fc4,09f74,b9564,a7a67,d5212,657d8,9e71f,d5671,4f2f5,dde4a,d9948,5e4f0,0fce3,30b17,76889,c8367,330a4,2136b,9b3aa,da6f6,03acc,b8b49,9b26d,efb1c,0335b,e1ff8,84748,fc362,bb949,52b1f,d889e,7d679,278c4,1e33a,61585,0e2a8,04ad2,66e1d,35e89,c3f4e,24d88,63e36,a8539,8b54b,99028,4f644,681cc,d4b63,45a57,87437,310a7,26880,a631e,7a736,b2b3f,8af48,ddf01,a1e78,2dd02,90ae2,8ebd5,a922c,34e3a,96c58,1d849,ab04f,d9e12,fdec7,0b9d5,1cc77,6a6eb,76eac,e00c4,1f79d,2568c,060cb,c974f,77894,e3639,99176,08f47,7d3e8,5d5f9,fa17e,21ae2,445f8,4a294,c6cd0,db775,6e474,ecb0a,aa3c7,cf79b,30d3b,376b0,e28bd,0a9df,fc6d8,70456,d7f5e,74deb,bd184,01761,0ab46,9e209,c2780,fae95,efd55,5129f,d981e,b095f,d990a,dca62,de660,a0943,d303e,d3db8,dd9f2,a01c5,e7852,e71e3,f88b3,aed38,e4038,c05db,b4281,bddb4,b1ecf,f74af,e9fcc,d5370,f319a,a4931,b400e,adef2,d3047,eb6e6,a6c84,c769b,b574d,b1277,dbd4c,d8939,b475c,b926d,e4ecd,cf96a,b1460,43831,4be83,ab827,09298,de6ca,39c10,9b2b8,5ff22,48aeb,82f57,7ca7d,b4173,0b0a2,ae848,953e8,de7cc,86e55,109bf,a7f08,55208,1b0aa,db1e7,8ac2e,02a6c,33e3b,7d133,b1c42,fae58,039c3,26425,ecf67,5dd2c,cf4b7,3af32,a1ec0,37572,85680,c331c,27222,f11ad,c619f,ee425,5286a,9f0
93,d3dcf,80ac4,97e71,d29fa,73988,abce4,c1072,8b749,abb35,e3c04,3ebcf,27a67,1eca2,3ca38,f0d9a,08a0f,6efd4,535ff,b0cfa,2be29,2e6f9,5e5a0,0331c,8797d,42700,b2d3b,ee98d,28198,a2b02,ba561,2dd94,f935c,42380,b5ebc,9887c,9970d,5ce92,7606b,d2f61,4cdcc,e4e52,f9e1f,99047,57680,f08fb,e1603,ff658,1e293,d7b66,b1ed6,94ac8,92d2c,f5541,20a5b,48979,f0bdd,7c362,4fc99,154c4,dfaec,f9a9c,95fb6,b2414,a1a57,b0e99,a8ea1,c28af,d943c,d8e2c,b99f1,b94b8,fa1d8,c4fcc,f8c1b,cb560,a8fdf,b4364,d463e,e9545,f21fc,c720e,c8821,f9f0d,ab201,d9367,f9a18,f7ae4,b415c,dbcd8,a60ba,a195a,ede25,fb8b9,cdad5,f52b0,d000a,f02ff,f898e,b1cdc,b6340,e4fa1,bd39e,a3d71,c79c1,c4a20,ce523,fb29c,eec0c,3576d,41eed,aaba2,c0138,7fd3e,ee164,46c17,8302c,e6b90,fca49,97736,c3502,e35a4,585e2,eee72,ee94f,a2ba8,0a40d,47864,90832,ba75e,4e3a9,07d62,75b2c,76f65,eca2f,b389c,66a17,b01bf,c5098,d62fa,89687,94f69,6da2f,dc76e,6c125,6802e,a69a4,96eed,c12c3,d40fe,d8dee,e95ce,f8a38,dbd1f,f7681,d825b,c4903";
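Instead of copying the string by hand, the variables can be pulled out of the JavaScript text with a regular expression. This is a sketch only: it assumes the source has the `var ExamCount = ...;` and `var ExamCodes = "...";` form shown above, and the function name `parse_exam_codes` and the short sample string are made up for illustration.

```python
import re


def parse_exam_codes(js_source):
    """Extract (ExamCount, list of exam codes) from the JavaScript source text."""
    count_match = re.search(r'var\s+ExamCount\s*=\s*(\d+)\s*;', js_source)
    codes_match = re.search(r'var\s+ExamCodes\s*=\s*"([^"]*)"\s*;', js_source)
    if not count_match or not codes_match:
        raise ValueError('ExamCount or ExamCodes not found')
    # Remove whitespace inside the string, for example line breaks
    # introduced when the long string was wrapped during copying.
    raw = re.sub(r'\s+', '', codes_match.group(1))
    codes = [code for code in raw.split(',') if code]
    return int(count_match.group(1)), codes


# Shortened, made-up sample in the same format as the listing above.
sample = 'var ExamCount = 3;\nvar ExamCodes = "d704f,bdb25,2d1d4";'
count, codes = parse_exam_codes(sample)
print(count, codes)
```

Comparing `count` with `len(codes)` after parsing is a cheap sanity check that no code was lost or split while copying.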

3. Python code

The complete code is as follows:

import requests
import time
import urllib.request
import docx  # python-docx, used to build the Word document in memory
from docx.shared import Pt
from docx.enum.text import WD_PARAGRAPH_ALIGNMENT

def download_img(img_url, imgname):
  """Download one image; return its local path, or "failed" on any error."""
  request = urllib.request.Request(img_url)
  try:
    response = urllib.request.urlopen(request)
    img_name = imgname + '.gif'
    filename = "D:\\Program Files (x86)\\PyCharm Community Edition 2020.1.1\\PaQukemu4tiku\\folder\\imgpath\\" + img_name
    if response.getcode() == 200:
      with open(filename, "wb") as f:
        f.write(response.read())  # write the image bytes to disk
      return filename
    return "failed"  # any status other than 200
  except Exception:
    return "failed"
ExamVersion = "20210205090935"
ExamCount = 1487
ExamCodes = "d704f,bdb25,2d1d4,a9eb5,b0671,9b18f,2b0c4,8f1ec,07f6d,ed8f7,a74a2,46806,faa42,a580e,4fc7a,0fb21,97686,37b6b,c9263,f1b18,79e07,e9c0b,01e28,d345f,fcb88,f1301,75642,2a3b2,f70b6,2993b,b493e,64a3d,c9fdd,21f14,53c76,ddd2d,7f5d0,f8543,ee954,533ba,dcd4a,2c15b,8ac1a,e97d7,10e03,0202c,b109d,ebf95,0f7f1,0bca9,36b59,9eefd,eb795,b2a2b,704ea,856e9,9e92c,e76a1,e3970,c36f5,5a407,ee9d9,aeb78,0a570,bdc91,f8f59,c08f0,bb45b,515d9,b347d,28b17,b3dab,bbbbd,94c53,54a92,f1e50,09587,c7890,7a9b3,354b9,e2803,78c8c,256e7,3a8cc,43137,883e8,4f318,b7641,f3192,5a496,1c1fa,7e43d,a50ed,87ad2,9ea88,c788b,95c60,ca436,007c7,a7057,cab0a,5afed,e32ce,7a961,157f8,fc304,12c14,179bc,82e35,863ac,bd2f8,5655d,62935,ff906,8eda8,bad13,8714d,af074,303f7,d4bf5,c78bc,97940,7a981,c87e2,bb685,eb46c,b14b5,7c563,7fc96,e5540,7134a,b85c7,49e6f,9d464,aba3d,60a01,e11cd,bc16b,8ff78,cd884,b2efe,79e6b,dd188,969c7,bc7d9,a856f,c8f91,bb731,45056,658d9,ba732,ebf44,5e748,33f22,58c97,4f211,94644,ee187,881a9,97ecd,6017d,c0b7e,1ad72,5c82d,8146e,3b63a,32c78,c5ac6,9a9ad,9c436,3ce2d,f7e51,b1371,28e07,e39fb,a74c9,2d870,73ccd,e5cfc,04f25,77e6e,9c259,59e10,9f83a,a8d5b,40245,6b3b1,473e8,a215a,b6313,2578c,27457,ca78c,140ef,dbab9,67aac,be529,9fa10,c158f,3e964,5d883,c996e,1e2bb,fc9c7,c096c,07512,683c0,ac2bf,981ea,54456,5f324,90575,db76b,817db,60ea8,8502e,b30e3,90414,a44b7,87ad5,d6e51,6112b,3f620,d23ee,c3f1f,5c057,aafad,63844,0b51d,51231,6679a,bc264,e1353,97cd8,50512,56ecd,610f0,205d7,ed246,c0b90,c9b5c,6d806,48695,961e8,dec7c,71b2c,6a3c6,02b1d,8635f,b8bc4,73790,0a39e,128be,b6b28,88e4f,542df,60ef6,321d8,9be31,48d07,c007a,aea99,837e2,16d82,b260a,21389,906cb,59cd0,d8d85,2da51,674bb,14ef3,71dca,0634a,893a8,0e0ec,e7ef9,15d22,80b13,436f7,6e7e9,60428,a04f4,efbe5,2e063,6aae7,614f2,a4e14,ab335,0e9e1,f1b88,a30a3,8e766,b2b36,b32a7,208cd,8d6c5,ca538,35b1c,c128c,88bc8,c123a,0ec98,83b56,6641b,b279a,242e0,a14a9,0bed0,9795a,d802e,ab41c,db4a1,e0842,05f62,44840,e79b5,91a1e,69132,bf47f,80796,3cf91,779b3,52c5e,44de9,dcd2f,ceee6,993c9,c4552,2b757,3a24b,0
766b,4d6fa,d9703,bda35,3cb4e,eeef6,3cd13,9103a,8fcad,c6424,76bcc,6746e,8022f,ad831,5fef8,53998,abfa9,93c85,f2b94,a34ce,c831f,14088,81cff,c2719,23fe1,e0774,4d34d,e2912,c516b,8d197,8bc73,9269f,3cbdc,05320,e7e31,78e16,b61d6,a8bec,cce92,f68ba,a0252,d058b,f8dde,e1e40,b733e,b812d,aea74,e417f,beca3,f7c37,bff53,a0c66,c87c0,f0175,e996e,f1b8f,b20eb,c5b0f,e72f8,d57bf,fffcc,de5e6,ef8b1,b20f2,e3834,cb709,fc06f,a4af2,e376f,d79a1,acd5a,ba00b,d5399,f8091,c218a,d67d2,a740f,d324d,fa535,a1ffa,bcbed,e7755,cc3f0,aaef0,b6c1d,a3f35,ebed2,f8175,bebd6,cdf38,f8679,ccf4a,f1c28,bf561,b1477,babb5,ac9f8,f2f75,fbc74,c0f8d,c72e1,bc708,d9ac6,ea6fa,da81e,df72f,faa0e,addb6,c8ac1,a8a27,f3548,f634b,ec2cf,a4d17,e68d1,eff90,d51b1,bdfee,e0dc2,e421a,b8a92,e1b62,d01d0,cc671,c153a,fc9cf,df350,d2c22,df916,c5d21,f765e,aa2d4,a9d8d,c2b36,ca016,b4e29,ac2fa,e4599,b83cd,f586d,ef7bf,b0c07,fc6d2,eeec7,e720e,b558e,ef56f,b3353,a6de8,fab1d,c921e,b6a29,a0019,b03ef,cc742,a3d8d,b4e35,e6668,fe71a,fd836,cded9,c458e,d454b,b2f57,deda8,ab459,f47b9,f0356,d2f78,e6b0c,ac32f,f78d5,d1c4e,da652,b391c,a2fc6,a77f4,de655,f8baf,b66e9,c8b74,d1dfa,ee79c,c3612,ec3f1,ea5c6,d0a61,c08cf,d7f50,f2685,ab910,f173e,d1499,ebcdd,f762d,e97b3,a47eb,e32c8,db61e,c8dca,fcfdd,e0051,d40d5,cfec0,f7cf1,af433,e4017,bc343,d5ba9,d60fe,bb346,a1b8c,ab5ed,b65e5,ec1cf,a7798,e6737,ea3b8,e1076,ce973,a5c75,ce15f,d70aa,dbc04,dce8a,b5565,a68f0,abc90,c1792,eddfa,ab24c,ac20a,a732d,e49d4,e7fb1,fface,e48cf,fa808,cc8b2,fe112,bcf86,c0e70,e085b,bc95e,fc890,cf8b5,cac27,c4e9c,ce788,ca5c8,b554b,d68a4,db427,b20b7,be61d,b0fc3,b18a2,c7946,ea80b,abb99,b9529,ddd67,b9d3e,b8b96,fc5da,ec6c4,d5cea,e1a86,f55b7,b4a50,cf3cb,e270a,d15ab,f7603,b0b77,b3cfa,c73e0,519d9,6d9af,ff6ac,0e915,9ac33,d927d,b47e4,f63f1,4e160,c5f3b,616e0,6cd0d,89473,f4f21,9bcb3,2867b,807a8,130fb,12c69,017d6,6aeaa,fe06a,59fbd,4a0ff,d27ca,eb735,819ab,4f065,f4a64,a6615,95ff2,e5b71,2c989,8b864,67974,8109c,a376e,9e031,80cf3,e7b13,084fd,ff06b,2e73a,50e6b,238d6,6e76a,859eb,467bf,21e1a,961a7,bd190,86f9b,93a64,d72d6,b12b0,ca724,46d
09,b8699,78c25,a7f66,ffcd7,7b165,04636,5841a,c1cc5,290e8,c97bc,88155,a4330,8619d,9bc46,dbb4a,81427,1fe18,69abd,4639e,31152,3a172,e6d89,65b42,08412,d087e,c6cc0,85d52,ba77a,0afcd,33a5e,b60ef,ca7d7,8e6ac,71375,61b21,eb252,0ad2c,2dd2f,5a7b7,c0fdf,cf98b,38e59,e8b7c,118f9,fd67f,2c3bb,d4024,c4a96,09f62,22173,f308d,df4cc,ce9b0,90577,397c5,fe055,19b64,efa09,10e0c,53c4b,14258,9bbfc,6bb19,0ed7e,38f11,4fa06,e5dd8,cf457,52ea4,38e5c,4193a,9b59f,26352,71e5e,d923d,c204e,7170f,d5470,c16c1,a5b36,51fe7,80b37,44ac6,37b37,1243e,d3937,3e01b,65c9d,3fea6,2a2c3,1cb7a,9821c,7971e,1163c,49c8f,646d7,54584,32bfa,b9ed8,8b4b7,7b211,4b622,69a0d,75b11,45352,44bf2,f7294,f1353,284e5,c00dd,57854,44d40,128d2,2205b,94278,e9c3b,6f533,b6b90,58638,4f718,0f36c,1ac30,34b2f,72e3c,2b14a,574e6,41689,66c69,79bf5,7e87f,8bb46,d1d95,b3c79,0b2b6,95c8c,321df,61c35,d4e94,16c87,e4408,8bd10,003c0,5a1d3,42799,edac5,005e8,cb158,f6bf8,547de,1840d,a2f3b,3d25a,965f0,5bd1b,cf0d7,40922,a579e,aa5be,d9a9d,bc86a,fd8c1,d84cc,e066a,a8c25,edd5e,36d33,2c773,fecd1,f52c9,e32ed,b3cc3,8aaae,1dccc,aa0f0,a73f1,7d027,68418,73131,8e24f,8a579,e6fb8,f1bbf,801c6,d4eb6,e79a0,85d3d,077c8,e001c,45cf4,8362b,c00e3,047f6,e5678,4d14b,c045d,1f4a4,a99b0,1e938,d3633,b84e8,f3c29,2fc4a,195c0,6000b,49b05,0203b,6effd,9a31a,3ba9a,eaa09,f0573,6d220,2fa6c,f9acd,c1cb6,43b6e,fa908,ba1cf,dd3c4,64fdd,d7ddf,e6f82,b4c0a,79d4f,df8c9,9c0a2,b121c,aa5c5,ac2b2,a5fa8,ee70b,56c48,23082,f6de7,d5275,e97d9,4ee52,c8971,c8eb5,927a3,33416,4507d,b5113,e7f9d,853ef,97b3c,3cfb2,fecd3,aa1d3,178e3,10b7d,c8c06,f33bb,e046d,0c159,abf33,0a7bd,a990d,1e31d,daf75,49268,f0e5e,6b4f8,bd4f8,684b6,b2701,9c972,69103,17d69,2ea7b,ce1c0,30081,02465,859de,6b269,307f5,d2870,8ffaf,222a3,183b4,a9bfb,d693a,bc73e,9b2eb,b7749,abf5d,abeaa,8de3b,a1a23,08fb6,52c84,23d21,2ae95,636ee,7cd61,98a4f,f12bb,f5aa6,2566e,b107b,b4fa8,3bff4,a174e,322cf,1eea3,f80c5,a0e5a,34b1f,afa49,29305,9585c,a236a,82381,d7a96,41172,2a1e0,6ce2a,2f303,49452,5f567,e7b8a,edddd,5ec6d,e8369,bea53,7b903,212b0,35f9b,0cc9a,24b24,5cfe9,b78ae,ce597
,8bfb9,ac70b,da0b2,e4568,e6786,d5da8,abcda,d85eb,b0fd9,ed9a4,f0b48,bce0a,fdf29,fbefb,ab641,fb21f,d770c,ec608,d6e33,f73e8,c661f,efc46,ba9d4,f911c,ed8bc,ef63d,be67e,c9c90,cfa3e,2a692,88fbb,88658,c0bd1,568f4,2cca3,3cd85,91402,eca62,d4863,70705,d47f6,3b054,29a98,1e3c8,ccc22,fcaa3,43b54,5eaf6,5c85d,dfcfa,6497c,da3cc,bbd1c,62646,212f6,49141,b6396,6bfe9,14b90,47364,8798b,0eec5,83039,8b0b5,f43f7,b2dea,a8723,4e5bd,c741c,33380,43429,503c6,05fcb,70446,35d8c,4d9c9,25e1c,c5ad4,a2372,07ba3,53515,2ddaa,549c1,c92ea,39a17,ea5ef,f7ca4,e1817,86aea,ca38a,243cf,2a4ee,42e8d,09f91,3f35d,11e0b,072c6,73b62,d4333,5b9c6,0b507,f027c,5aea2,e78cd,602cd,c723b,54fa6,9d161,11fbd,fd2e1,6425d,fb2b7,0ea23,368fa,78d15,95162,19158,3bda1,e1861,8f8eb,5cfa8,786db,60697,2048c,c10b7,99425,7e170,44af8,0fa2e,783ba,83705,5c305,85b33,f9b21,4c0b1,f5ac0,5fddb,58158,8c539,252fe,871b9,24fc4,09f74,b9564,a7a67,d5212,657d8,9e71f,d5671,4f2f5,dde4a,d9948,5e4f0,0fce3,30b17,76889,c8367,330a4,2136b,9b3aa,da6f6,03acc,b8b49,9b26d,efb1c,0335b,e1ff8,84748,fc362,bb949,52b1f,d889e,7d679,278c4,1e33a,61585,0e2a8,04ad2,66e1d,35e89,c3f4e,24d88,63e36,a8539,8b54b,99028,4f644,681cc,d4b63,45a57,87437,310a7,26880,a631e,7a736,b2b3f,8af48,ddf01,a1e78,2dd02,90ae2,8ebd5,a922c,34e3a,96c58,1d849,ab04f,d9e12,fdec7,0b9d5,1cc77,6a6eb,76eac,e00c4,1f79d,2568c,060cb,c974f,77894,e3639,99176,08f47,7d3e8,5d5f9,fa17e,21ae2,445f8,4a294,c6cd0,db775,6e474,ecb0a,aa3c7,cf79b,30d3b,376b0,e28bd,0a9df,fc6d8,70456,d7f5e,74deb,bd184,01761,0ab46,9e209,c2780,fae95,efd55,5129f,d981e,b095f,d990a,dca62,de660,a0943,d303e,d3db8,dd9f2,a01c5,e7852,e71e3,f88b3,aed38,e4038,c05db,b4281,bddb4,b1ecf,f74af,e9fcc,d5370,f319a,a4931,b400e,adef2,d3047,eb6e6,a6c84,c769b,b574d,b1277,dbd4c,d8939,b475c,b926d,e4ecd,cf96a,b1460,43831,4be83,ab827,09298,de6ca,39c10,9b2b8,5ff22,48aeb,82f57,7ca7d,b4173,0b0a2,ae848,953e8,de7cc,86e55,109bf,a7f08,55208,1b0aa,db1e7,8ac2e,02a6c,33e3b,7d133,b1c42,fae58,039c3,26425,ecf67,5dd2c,cf4b7,3af32,a1ec0,37572,85680,c331c,27222,f11ad,c619f,ee425,5286a,9f093,d
3dcf,80ac4,97e71,d29fa,73988,abce4,c1072,8b749,abb35,e3c04,3ebcf,27a67,1eca2,3ca38,f0d9a,08a0f,6efd4,535ff,b0cfa,2be29,2e6f9,5e5a0,0331c,8797d,42700,b2d3b,ee98d,28198,a2b02,ba561,2dd94,f935c,42380,b5ebc,9887c,9970d,5ce92,7606b,d2f61,4cdcc,e4e52,f9e1f,99047,57680,f08fb,e1603,ff658,1e293,d7b66,b1ed6,94ac8,92d2c,f5541,20a5b,48979,f0bdd,7c362,4fc99,154c4,dfaec,f9a9c,95fb6,b2414,a1a57,b0e99,a8ea1,c28af,d943c,d8e2c,b99f1,b94b8,fa1d8,c4fcc,f8c1b,cb560,a8fdf,b4364,d463e,e9545,f21fc,c720e,c8821,f9f0d,ab201,d9367,f9a18,f7ae4,b415c,dbcd8,a60ba,a195a,ede25,fb8b9,cdad5,f52b0,d000a,f02ff,f898e,b1cdc,b6340,e4fa1,bd39e,a3d71,c79c1,c4a20,ce523,fb29c,eec0c,3576d,41eed,aaba2,c0138,7fd3e,ee164,46c17,8302c,e6b90,fca49,97736,c3502,e35a4,585e2,eee72,ee94f,a2ba8,0a40d,47864,90832,ba75e,4e3a9,07d62,75b2c,76f65,eca2f,b389c,66a17,b01bf,c5098,d62fa,89687,94f69,6da2f,dc76e,6c125,6802e,a69a4,96eed,c12c3,d40fe,d8dee,e95ce,f8a38,dbd1f,f7681,d825b,c4903"
ExamCodes = ExamCodes.split(',')

file = docx.Document()  # create the Word document object in memory
headers = {
  'Connection': 'close'
}
for i in range(0, ExamCount):
  ExamCodei = ExamCodes[i]
  # URL of the JSON data for this question
  urlselecti = 'https://tkdata.mnks.cn/ExamData/' + ExamCodei + '.json?CALL=?20201231143735.json'
  # verify=False turns off TLS certificate verification, as in the original script
  responsei = requests.get(urlselecti, headers=headers, timeout=50, verify=False)
  resulti = responsei.json()
  # 'tm' holds the question stem, followed by the options (if any) separated by <br/>
  ExamTi = resulti['tm'].split('<br/>')
  options = ExamTi[1:]  # empty when the question has no options
  # '答案' means "answer"; 'da' holds the correct answer
  print(str(i+1) + '、' + ExamTi[0] + ' 答案:' + resulti['da'] + '\n  ' + ' '.join(options))
  file.add_paragraph(str(i+1) + '、' + ExamTi[0] + ' 答案:' + resulti['da'])
  if resulti['tv'] != '':
    # 'tv' is the image path; its third segment is the image file name
    ExamTiimg = resulti['tv'].split('/')
    ExamTiimgurl = 'https://sucimg.itc.cn/sblog/' + ExamTiimg[2]
    print(ExamTiimgurl)
    img_path = download_img(ExamTiimgurl, ExamTiimg[2])
    if img_path and img_path != "failed":  # skip the picture if the download failed
      paragraph = file.add_paragraph()
      paragraph.alignment = WD_PARAGRAPH_ALIGNMENT.CENTER  # center the image
      run = paragraph.add_run("")
      run.add_picture(img_path)
  if options:
    file.add_paragraph('    ' + '   '.join(options))
  # time.sleep(1)  # uncomment to pause between requests
file.save("D:\\Program Files (x86)\\PyCharm Community Edition 2020.1.1\\PaQukemu4tiku\\folder\\C1科目四1487题.docx")  # save the document
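The per-question handling in the loop above can also be isolated in a small function, which makes it testable without network access. This is a sketch under the assumption that every JSON record carries the `tm`, `da`, and `tv` fields the script reads; the function name `parse_question` and both sample records are made up for illustration.

```python
def parse_question(record):
    """Split one question record into stem, options, answer, and image URL."""
    # 'tm' holds the stem, followed by the options (if any) separated by <br/>.
    parts = record['tm'].split('<br/>')
    image_url = None
    if record.get('tv'):
        # Same rule as the script: the third path segment of 'tv' is the image name.
        image_name = record['tv'].split('/')[2]
        image_url = 'https://sucimg.itc.cn/sblog/' + image_name
    return {
        'stem': parts[0],
        'options': parts[1:],   # empty list when the question has no options
        'answer': record['da'],
        'image_url': image_url,
    }


# Made-up records that only mimic the assumed field layout.
with_options = {'tm': 'Stem text<br/>A. one<br/>B. two', 'da': 'B', 'tv': ''}
with_image = {'tm': 'Another stem', 'da': 'A', 'tv': 'x/y/pic01'}
print(parse_question(with_options))
print(parse_question(with_image))
```

Taking `parts[1:]` rather than indexing four fixed options means a question with fewer or more options does not raise an `IndexError`.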

4. Results

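The imgpath directory inside folder in the project directory holds the images of all questions, and C1科目四1487题.docx is the generated document.
Open the Word document to check it: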

The Word document can also be saved as a PDF.

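Unlike Subject 1, many multiple-choice questions in Subject 4 use animated GIFs rather than static PNGs, so the images do not animate once they are saved in Word. One workaround is to save the Word document as a web page (.html).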

The result (with animation) is shown below:
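Saving from Word is a manual step. As an alternative sketch (not the approach used in this article), the question data could be written straight to an HTML page, where a browser plays the GIF animations. The function name `questions_to_html`, the dictionary layout of a question, and the sample image path are all assumptions made for illustration.

```python
import html


def questions_to_html(questions):
    """Render a list of question dicts as a simple HTML page (hypothetical helper)."""
    lines = [
        '<!DOCTYPE html>',
        '<html><head><meta charset="utf-8"><title>Questions</title></head><body>',
    ]
    for number, q in enumerate(questions, start=1):
        lines.append('<p>{}. {} Answer: {}</p>'.format(
            number, html.escape(q['stem']), html.escape(q['answer'])))
        if q.get('image'):
            # An <img> tag pointing at the GIF file lets the browser animate it.
            lines.append('<p style="text-align:center"><img src="{}"></p>'.format(
                html.escape(q['image'], quote=True)))
        if q.get('options'):
            lines.append('<p>{}</p>'.format(
                ' '.join(html.escape(option) for option in q['options'])))
    lines.append('</body></html>')
    return '\n'.join(lines)


# Made-up sample data; the image path is a placeholder.
page = questions_to_html([
    {'stem': 'Is 1 < 2?', 'answer': 'A', 'options': ['A. yes', 'B. no'],
     'image': 'imgpath/sample.gif'},
])
print(page)
```

Escaping every text field with `html.escape` keeps characters such as `<` in a question from being interpreted as markup.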


Date: 2021-03-30
