multithreading - Python socket and thread pooling, how to get more performance?




I am trying to implement a basic library to issue HTTP GET requests. My goal is to receive data through socket connections, with a minimalistic design to improve performance, using threads and thread pool(s).

I have a bunch of links, which I group by hostname. Here is a simple demonstration of the input URLs:

    hostname1.com - 500 links
    hostname2.org - 350 links
    hostname3.co.uk - 100 links
    ...
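The grouping step described above could be sketched like this. It is only a minimal illustration in Python 3 (the question predates it and uses Python 2 module names); `group_by_host` is a hypothetical helper name, not something from the question.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_host(urls):
    """Return {hostname: [url, ...]}, keeping input order within each host."""
    groups = defaultdict(list)
    for url in urls:
        # urlsplit lowercases the hostname; it is None if the URL has no host.
        host = urlsplit(url).hostname
        groups[host].append(url)
    return dict(groups)
```

Each group can then be handed to its own socket pool, so that every persistent connection only ever talks to one host.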

I intend to use sockets for performance reasons. The plan is to keep a number of sockets connected (if possible, and it is) and issue HTTP requests over them. The idea came from urllib's low performance on continuous requests; then I met urllib3, realized it uses httplib, and decided to try sockets. Here is what I have accomplished so far:

A getsocket class, a socketpool class, and threadpool and worker classes.

The getsocket class is a minified, "HTTP only" version of Python's httplib.
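For readers who have not written such a class, here is a rough sketch of what an "HTTP only" GET over one persistent socket can look like. It is not the asker's getsocket class (that code is not shown); `KeepAliveGet` is a hypothetical name, the code is Python 3, and it assumes the server answers with a `Content-Length` header. Chunked transfer encoding, redirects and reconnects are deliberately left out.

```python
import socket


class KeepAliveGet:
    """Hypothetical stand-in for getsocket: one persistent HTTP/1.1 connection."""

    def __init__(self, host, port=80, timeout=5):
        self.host_header = host if port == 80 else "{}:{}".format(host, port)
        self.sock = socket.create_connection((host, port), timeout=timeout)
        self.rfile = self.sock.makefile("rb")  # buffered reader for line parsing

    def get(self, path):
        request = (
            "GET {} HTTP/1.1\r\n"
            "Host: {}\r\n"
            "Connection: keep-alive\r\n"
            "\r\n"
        ).format(path, self.host_header)
        self.sock.sendall(request.encode("ascii"))

        # Status line, e.g. "HTTP/1.1 200 OK"
        status_line = self.rfile.readline().decode("iso-8859-1")
        status = int(status_line.split()[1])

        # Header lines until the blank line that ends the header block
        headers = {}
        while True:
            line = self.rfile.readline().decode("iso-8859-1")
            if line in ("\r\n", "\n", ""):
                break
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()

        # Assumption: the body length is given by Content-Length.
        length = int(headers.get("content-length", 0))
        body = self.rfile.read(length)
        return status, headers, body

    def close(self):
        self.rfile.close()
        self.sock.close()
```

Calling `get()` repeatedly on the same object reuses the TCP connection, which is where the saving over one connection per request comes from.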

So, I use these classes like this:

    sp = comm.socketpool(host, size=self.poolsize, timeout=5)
    for link in linklist:
        pool.add_task(self.__get_url_by_sp, self.count, sp, link, results)
        self.count += 1
    pool.wait_completion()

The __get_url_by_sp function is a wrapper that calls sp.urlopen and saves the result to the results list. I am using a pool of 5 threads, which shares a socket pool of 5 getsocket instances.
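The pattern described here (N worker threads sharing N persistent connections) can be sketched with the standard library alone. This is not the asker's socketpool or threadpool code; `ConnectionPool`, `fetch_all`, `factory` and `fetch` are hypothetical names, and the code is Python 3.

```python
import queue
from concurrent.futures import ThreadPoolExecutor
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool; queue.Queue makes borrowing and returning thread-safe."""

    def __init__(self, factory, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())  # open all connections up front

    @contextmanager
    def borrow(self, timeout=None):
        conn = self._idle.get(timeout=timeout)  # blocks until one is free
        try:
            yield conn
        finally:
            # A fuller version would replace a broken connection here
            # instead of putting it back unchanged.
            self._idle.put(conn)


def fetch_all(pool, paths, fetch, workers=5):
    """Run fetch(conn, path) for every path; results come back in input order."""
    def task(path):
        with pool.borrow() as conn:
            return fetch(conn, path)

    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(task, paths))
```

With as many connections as threads, no thread ever waits for a socket; with fewer connections than threads, `borrow()` is where the extra threads queue up.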

What I wonder is: is there any other way I can improve the performance of this system?

I've read about asyncore here, but I couldn't figure out how to reuse the same socket connection with the class httpclient(asyncore.dispatcher) example provided there.

Another point: I don't know whether I'm using a blocking or a non-blocking socket, which of the two would improve performance, or how to implement either one.
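On the blocking question, a socket can simply be asked which mode it is in. In Python, `gettimeout()` returns `None` for a blocking socket, `0.0` for a non-blocking one, and a positive number for a blocking socket with a timeout. If the `timeout=5` passed to the pool ends up in `settimeout(5)` (an assumption, since that code is not shown), the sockets are blocking with a 5 second limit. A small Python 3 sketch, with `describe_mode` and `recv_when_ready` as hypothetical helper names:

```python
import selectors
import socket


def describe_mode(sock):
    """Classify a socket by its timeout setting."""
    timeout = sock.gettimeout()
    if timeout is None:
        return "blocking"
    if timeout == 0.0:
        return "non-blocking"
    return "timeout"


def recv_when_ready(sock, nbytes, wait=1.0):
    """Wait up to `wait` seconds for readable data, then read it.

    Returns None if nothing arrived in time. This is the usual way to
    drive a non-blocking socket: ask a selector when it is ready instead
    of calling recv() in a loop.
    """
    with selectors.DefaultSelector() as sel:
        sel.register(sock, selectors.EVENT_READ)
        if not sel.select(timeout=wait):
            return None
        return sock.recv(nbytes)
```

Calling `recv()` on a non-blocking socket with no data pending raises `BlockingIOError` rather than waiting. Non-blocking mode pays off when a single thread watches many sockets through one selector; with one thread per socket, as in the setup described above, blocking sockets with a timeout are the simpler fit.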

Please be specific about your experiences. I don't intend to import another library just to do HTTP GET; I want to code my own tiny library.

Any help is appreciated, thanks.

Do this.

Use multiprocessing: http://docs.python.org/library/multiprocessing.html.

Write a worker process that puts all of the URLs into a queue.

Write a worker process that gets a URL from the queue and does the GET, saving a file and putting the file info into another queue. You'll probably want multiple copies of this process. You'll have to experiment to find out how many is the right number.

Write a worker process that reads file info from that queue and does whatever it is that you're trying to do.
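The three stages above could be wired together roughly as follows. This is a sketch under assumptions, not the answerer's code: all function names are hypothetical, `fetch(url)` stands for "do the GET, save the file, return the file info", `handle(url, info)` stands for the final processing, and a `None` sentinel tells each stage when to stop. For brevity the last stage runs in the calling process rather than in a process of its own.

```python
import multiprocessing as mp

STOP = None  # sentinel: no more work


def produce(urls, url_q, n_fetchers):
    """Stage 1: put every URL on the queue, then one sentinel per fetcher."""
    for url in urls:
        url_q.put(url)
    for _ in range(n_fetchers):
        url_q.put(STOP)


def fetch_worker(url_q, info_q, fetch):
    """Stage 2: fetch each URL and pass (url, file info) downstream."""
    while True:
        url = url_q.get()
        if url is STOP:
            info_q.put(STOP)  # tell the consumer this fetcher is done
            break
        info_q.put((url, fetch(url)))


def consume(info_q, n_fetchers, handle):
    """Stage 3: process file info until every fetcher has signalled STOP."""
    finished = 0
    while finished < n_fetchers:
        item = info_q.get()
        if item is STOP:
            finished += 1
        else:
            handle(*item)


def run_pipeline(urls, fetch, handle, n_fetchers=4):
    """Start the producer and the fetchers, then consume in this process."""
    url_q, info_q = mp.Queue(), mp.Queue()
    procs = [mp.Process(target=produce, args=(urls, url_q, n_fetchers))]
    procs += [mp.Process(target=fetch_worker, args=(url_q, info_q, fetch))
              for _ in range(n_fetchers)]
    for proc in procs:
        proc.start()
    consume(info_q, n_fetchers, handle)
    for proc in procs:
        proc.join()
```

`run_pipeline` should be called from under an `if __name__ == "__main__":` guard, and `fetch` has to be a module-level function so it can be sent to the child processes on platforms that spawn rather than fork. The `n_fetchers` value is the number the answer says to find by experiment.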

python multithreading sockets threadpool http-get
