Download a file from the web in Python 3

QUESTION

Tuesday, August 30, 2011

I am creating a program that will download a .jar (Java) file from a web server by reading the URL specified in the .jad file of the same game/application. I'm using Python 3.2.1.

I've managed to extract the URL of the JAR file from the JAD file (every JAD file contains the URL of its JAR file), but as you may imagine, the extracted value is a string (type str).

Here's the relevant function:

def downloadFile(URL=None):
    import httplib2
    h = httplib2.Http(".cache")
    resp, content = h.request(URL, "GET")
    return content

downloadFile(URL_from_file)

However, I always get an error saying that the type in the function above has to be bytes, not string. I've tried using URL.encode('utf-8'), and also bytes(URL, encoding='utf-8'), but I always get the same or a similar error.

So basically my question is: how do I download a file from a server when the URL is stored in a string?

ANSWER

Tuesday, August 30, 2011

If you want to get the contents of a web page into a variable, just read the response of urllib.request.urlopen:

import urllib.request
...
url = 'http://example.com/'
response = urllib.request.urlopen(url)
data = response.read()      # a `bytes` object
text = data.decode('utf-8') # a `str`; this step can't be used if data is binary
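
The decode step above hard-codes UTF-8. As a minimal sketch, assuming the server may declare a different charset in its Content-Type header, the decoding can be wrapped in a small helper. The name decode_body and its fallback parameter are hypothetical, introduced only for illustration; in real use the charset argument would typically come from response.headers.get_content_charset(), which returns None when nothing is declared.

```python
from typing import Optional


def decode_body(data: bytes, charset: Optional[str], fallback: str = "utf-8") -> str:
    """Decode downloaded bytes with the declared charset, or a fallback.

    `decode_body` is a hypothetical helper, not part of urllib.
    """
    # Use the declared charset when there is one, otherwise the fallback.
    return data.decode(charset or fallback)


# Offline usage with hand-made bytes instead of a real HTTP response:
latin_text = decode_body("café".encode("latin-1"), "latin-1")
utf8_text = decode_body("café".encode("utf-8"), None)
```

This still only makes sense for textual responses; binary data such as a JAR file should stay a bytes object.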

The easiest way to download and save a file is to use the urllib.request.urlretrieve function:

import urllib.request
...
# Download the file from `url` and save it locally under `file_name`:
urllib.request.urlretrieve(url, file_name)

import urllib.request
...
# Download the file from `url`, save it in a temporary directory and get the
# path to it (e.g. '/tmp/tmpb48zma.txt') in the `file_name` variable:
file_name, headers = urllib.request.urlretrieve(url)
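
The snippets above assume that file_name is already known. One possible way to derive it, sketched here under the assumption that the last path segment of the URL is a sensible local name, is to parse the URL with urllib.parse. The helper file_name_from_url and its default value are hypothetical names chosen for this example.

```python
import posixpath
import urllib.parse


def file_name_from_url(url: str, default: str = "download.bin") -> str:
    """Return the last path segment of `url`, or `default` if there is none.

    Hypothetical helper for illustration; it ignores the query string and
    any Content-Disposition header the server might send.
    """
    path = urllib.parse.urlparse(url).path
    # URL paths always use '/', so posixpath works on every platform.
    name = posixpath.basename(urllib.parse.unquote(path))
    return name or default


jar_name = file_name_from_url("http://example.com/games/app.jar?session=1")
fallback_name = file_name_from_url("http://example.com/")
```

The result could then be passed as the second argument to urlretrieve, or to open() in the urlopen variants.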

But keep in mind that urlretrieve is considered legacy and might become deprecated (not sure why, though).

So the most correct way to do this would be to use the urllib.request.urlopen function to return a file-like object that represents an HTTP response, and copy it to a real file using shutil.copyfileobj.

import urllib.request
import shutil
...
# Download the file from `url` and save it locally under `file_name`:
with urllib.request.urlopen(url) as response, open(file_name, 'wb') as out_file:
    shutil.copyfileobj(response, out_file)
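
To illustrate why this approach copes with large downloads, here is a rough sketch of the kind of chunked loop that shutil.copyfileobj performs: it reads a bounded block, writes it, and repeats, so the whole file never has to sit in memory. The function copy_in_chunks and its chunk size are assumptions made for this example, not the library's actual implementation, and io.BytesIO objects stand in for the HTTP response and the output file so the snippet runs offline.

```python
import io


def copy_in_chunks(source, destination, chunk_size: int = 64 * 1024) -> int:
    """Copy from one file-like object to another; return bytes copied.

    Hypothetical stand-in showing the idea behind shutil.copyfileobj.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty bytes object signals end of stream
            break
        destination.write(chunk)
        total += len(chunk)
    return total


# In-memory stand-ins for `response` and `out_file`:
src = io.BytesIO(b"x" * 200000)
dst = io.BytesIO()
copied = copy_in_chunks(src, dst)
```

In practice shutil.copyfileobj is the better choice; the loop is shown only to make the memory behaviour visible.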

If this seems too complicated, you may want to go simpler and store the whole download in a bytes object and then write it to a file. But this works well only for small files.

import urllib.request
...
# Download the file from `url` and save it locally under `file_name`:
with urllib.request.urlopen(url) as response, open(file_name, 'wb') as out_file:
    data = response.read() # a `bytes` object
    out_file.write(data)

It is possible to extract .gz (and maybe other formats) compressed data on the fly, but such an operation probably requires the HTTP server to support random access to the file.

import urllib.request
import gzip
...
# Read the first 64 bytes of the file inside the .gz archive located at `url`
url = 'http://example.com/something.gz'
with urllib.request.urlopen(url) as response:
    with gzip.GzipFile(fileobj=response) as uncompressed:
        file_header = uncompressed.read(64) # a `bytes` object
        # Or do anything shown above using `uncompressed` instead of `response`.
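
The same pattern can be tried without a network connection. The sketch below assumes an io.BytesIO buffer as a stand-in for the HTTP response and a made-up payload. Because BytesIO is seekable, it only demonstrates the GzipFile usage and does not settle whether a particular server or response object needs random access.

```python
import gzip
import io

# Made-up payload, compressed in memory to imitate a remote .gz file.
payload = b"JAR-like binary payload " * 100
compressed_stream = io.BytesIO(gzip.compress(payload))

# Wrap the file-like object and read only the first 64 decompressed bytes.
with gzip.GzipFile(fileobj=compressed_stream) as uncompressed:
    file_header = uncompressed.read(64)  # a `bytes` object
```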