Script for archiving thingiverse things.
usage: thingy_grabber.py [-h] [-l {debug,info,warning}] [-d DIRECTORY] [-f LOG_FILE] [-q] [-c] [-a API_KEY]
{collection,thing,user,batch,version} ...
positional arguments:
{collection,thing,user,batch,version}
Type of thing to download
collection Download one or more entire collection(s)
thing Download a single thing.
user Download all things by one or more users
batch Perform multiple actions written in a text file
version Show the current version
optional arguments:
-h, --help show this help message and exit
-l {debug,info,warning}, --log-level {debug,info,warning}
level of logging desired
-d DIRECTORY, --directory DIRECTORY
Target directory to download into
-f LOG_FILE, --log-file LOG_FILE
Place to log debug information to
-q, --quick Assume date ordering on posts
-c, --compress Compress files
-a API_KEY, --api-key API_KEY
API key for thingiverse
Thingy_grabber v0.10.0 accesses thingiverse in a substantially different way from earlier versions. The plus side is that it should be more reliable, possibly faster, and no longer needs Selenium or a Firefox instance (which drastically reduces memory overhead). The downside is that you have to do one thing to continue using the app: get yourself an API key.
To do this, go to https://www.thingiverse.com/apps/create and create your own, selecting Desktop app.
Once you have your key, either specify it on the command line or put it in a text file called `api.key` wherever you are running the script from - the script will auto load it.
You need your own key because API keys can be (are?) rate limited.
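The lookup order described above (command line first, then an `api.key` file in the current directory) can be sketched as follows. This is a minimal illustration and not the script's actual code; the function name `resolve_api_key` and its parameters are hypothetical.

```python
from pathlib import Path
from typing import Optional


def resolve_api_key(cli_key: Optional[str],
                    key_file: Path = Path("api.key")) -> Optional[str]:
    """Hypothetical helper: prefer the key passed with -a, else read api.key.

    Returns None when no key can be found, so the caller can report an error.
    """
    if cli_key:
        return cli_key.strip()
    if key_file.is_file():
        # Strip whitespace so a trailing newline in the file is harmless.
        key = key_file.read_text(encoding="utf-8").strip()
        return key or None
    return None
```

Keeping the file lookup relative to the working directory matches the "wherever you are running the script from" behaviour described above.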
The latest version can be downloaded from here: https://github.com/cwoac/thingy_grabber/releases/. Under the 'assets' triangle there are precompiled binaries for Windows (no Python needed!).
First download the code. Either grab the source, or get the Windows binary from above and extract it somewhere. If you are running from source, see `requirements.yml` for the packages you need. You will also need an API key (as above) and to make a directory to store your downloads in.
Oh, and you need to know what you want to download, of course. It can be things, collections or all the designs of a user. Once you have done all this, open a command prompt and run it.
Let's say you are running Windows, using the precompiled binary, extracted the release to the `thingy_grabber` directory on your desktop, and made a `things` directory in your `Documents` directory.
When you open the command window, it will start in your home directory (say `c:\Users\cwoac`). Run `cd Desktop\thingy_grabber` to get to `c:\Users\cwoac\Desktop\thingy_grabber` and check that you are in the right place by trying to run `thingy_grabber` - you should get a long list of possible command line options that looks a lot like the list further up.
Supposing you want to download all of my stuff (for some crazy reason), then the command will look like this
thingy_grabber -a YOURAPIKEY -d "c:\Users\cwoac\Documents\things" -c user cwoac
The `-c` will cause the script to compress the download to a 7z file to save space. If you prefer to leave it uncompressed, just omit the `-c`.
That's the basics. Well, actually, there isn't much more than that to be honest. There is a batch mode, so if you create a text file with a list of lines like
user cwoac
user solutionlesn
collection cwoac at2018
then you can use the `batch` target to run each of these in turn. If you run it a second time with the same options, it will only download things which have changed or been added.
thingy_grabber.py thing thingid1 thingid2 ...
This will create a directory named after the title of the thing(s) with the given ID(s) and download the files into it.
thingy_grabber.py collection user_name collection_name1 collection_name2
Where `user_name` is the name of the creator of the collection (not necessarily your name!) and `collection_name1` etc. are the name(s) of the collection(s) you want.
This will create a series of directories `user-collection/thing-name` for each thing in the collection.
If for some reason a download fails, it will get moved sideways to `thing-name-failed` - this way, if you rerun it, it will only reattempt any failed things.
thingy_grabber.py user user_name1 user_name2 ...
Where `user_name1` etc. are the names of the creators.
This will create a series of directories `user designs/thing-name` for each thing that user has designed.
If for some reason a download fails, it will get moved sideways to `thing-name-failed` - this way, if you rerun it, it will only reattempt any failed things.
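The "moved sideways" behaviour for failed downloads might look roughly like the sketch below. It is an assumption about one way to do it, not a copy of the script's code; `mark_failed` and `needs_retry` are hypothetical names.

```python
import shutil
from pathlib import Path


def mark_failed(thing_dir: Path) -> Path:
    """Rename thing-name to thing-name-failed, replacing any older failed copy."""
    failed_dir = thing_dir.with_name(thing_dir.name + "-failed")
    if failed_dir.exists():
        # Assumption: a stale failed directory from an earlier run is discarded.
        shutil.rmtree(failed_dir)
    thing_dir.rename(failed_dir)
    return failed_dir


def needs_retry(base_dir: Path, thing_name: str) -> bool:
    """A thing should be retried if its -failed directory is present."""
    return (base_dir / (thing_name + "-failed")).is_dir()
```

Because the failed copy no longer sits at the normal path, a later run can tell at a glance which things still need another attempt.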
thingy_grabber.py batch batch_file
This will load a given text file and parse it as a series of calls to this script. Each line of the file should be of the form `command arg1 ...`.
Be warned that there is currently NO validation that you have given a correct set of commands!
An example:
thing 3670144
collection cwoac bike
user cwoac
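As an illustration of the `command arg1 ...` line format, here is a small sketch that splits a batch file into actions. It also adds the basic validation that the script itself currently lacks; `parse_batch` and `KNOWN_COMMANDS` are hypothetical names, and the set of accepted commands is an assumption based on the examples in this README.

```python
from typing import List, Tuple

# Assumption: only these three commands make sense inside a batch file.
KNOWN_COMMANDS = {"thing", "collection", "user"}


def parse_batch(text: str) -> List[Tuple[str, List[str]]]:
    """Split each non-empty line into (command, [args]) and reject bad lines."""
    actions = []
    for line_no, raw in enumerate(text.splitlines(), start=1):
        parts = raw.split()
        if not parts:
            continue  # skip blank lines
        command, args = parts[0], parts[1:]
        if command not in KNOWN_COMMANDS:
            raise ValueError(f"line {line_no}: unknown command {command!r}")
        if not args:
            raise ValueError(f"line {line_no}: {command!r} needs an argument")
        actions.append((command, args))
    return actions
```

Running a check like this over a batch file before scheduling it catches typos early rather than halfway through a long download.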
If you are using Linux, you can just add an appropriate call to the crontab. If you are using Windows, it's a bit more of a faff, but at least according to https://www.technipages.com/scheduled-task-windows, you should be able to with a command something like this (this is not tested!):
schtasks /create /tn thingy_grabber /tr "c:\path\to\thingy_grabber.py -d c:\path\to\output\directory batch c:\path\to\batchfile.txt" /sc weekly /d wed /st 13:00:00
You may have to play with the quotation marks to make that work though.
All modes now support 'quick mode' (`-q`), although this has no effect for individual item downloads. As thingiverse sorts its returned items in descending last modified order (I believe), once we have determined that we have the most recent version of a given thing in a collection, we can safely stop processing that collection, as we should already have all the remaining items in it. This substantially speeds up the process of keeping big collections up to date and will noticeably reduce the server load it generates.
Warning: As it stops as soon as it finds an up-to-date, successfully downloaded model, any unfixed failed downloads further down the list (for want of a better term) will not be retried.
Warning: At the moment I have not conclusively proven to myself that the result is ordered by last updated and not upload time. Once I have verified this, I will probably be making this the default option.
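The early-stop idea behind quick mode can be sketched like this. It is a simplified model under the assumption that items arrive newest-modified first; `update_collection`, `is_up_to_date` and `download` are hypothetical names, not the script's real functions.

```python
from typing import Callable, Iterable, List


def update_collection(thing_ids: Iterable[str],
                      is_up_to_date: Callable[[str], bool],
                      download: Callable[[str], None],
                      quick: bool = False) -> List[str]:
    """Download out-of-date things; in quick mode stop at the first current one."""
    downloaded = []
    for thing_id in thing_ids:
        if is_up_to_date(thing_id):
            if quick:
                # Assumes descending last-modified order: everything after
                # this point is older and should already be on disk.
                break
            continue
        download(thing_id)
        downloaded.append(thing_id)
    return downloaded
```

With quick mode on, anything listed after the first up-to-date thing is never examined, which is exactly why failed downloads further down the list are not retried.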
thingy_grabber.py collection cwoac bike
Download the collection 'bike' by the user 'cwoac'
thingy_grabber.py -d downloads -l warning thing 1234 4321 1232
Download the three things 1234, 4321 and 1232 into the directory downloads. Only give warnings.
thingy_grabber.py -d c:\downloads -l debug user jim bob
Download all designs by jim and bob into directories under `c:\downloads`, give lots of debug messages
python3, requests, py7zr (>=0.8.2)
`name_timestamp` where `timestamp` is the last upload time of the old files. The code will then copy unchanged files across and download any new ones.
`-d` to specify base download directory