刘凡 3282579ec1 first commit | há 2 anos atrás | |
---|---|---|
.. | ||
.gitignore | há 2 anos atrás | |
README.md | há 2 anos atrás | |
azure_storage_helpers.py | há 2 anos atrás | |
get_predict.py | há 2 anos atrás | |
model_helpers.py | há 2 anos atrás | |
model_train.py | há 2 anos atrás | |
text_processing.py | há 2 anos atrás |
This sample shows how to:
The modules model_helpers, text_processing and azure_storage_helpers in this repo can be used independently.
Install the following libraries used if you don't already have them installed by running pip install {name of library}
:
Make sure you have a CosmosDB database set up with data in it
Dataset is to come, but the expected format in this sample is:
{...
pages:[{
...
sections:[{
...
text:''
label:''
},
...
]
},
...
]
}
You can of course use your own data, in this case make some changes to the dataset building in model_train.py
Make sure you have an Azure Storage account set up
In model_train.py, replace the values in cosmosConfig and blobConfig with your own
Run python .\model_train.py
In get_predict.py, replace the values in blobConfig with your own
In get_predict.py, replace the text to get a prediction for
Run python .\get_predict.py
If you need to upload/retrieve files from CosmosDB or Azure Blob Storage in python, azure_storage_helpers.py is all you need
If you want to perform text processing (normalize text and remove stop words), you can use text_processing.py
If you want to train a classification model using sklearn, model_helpers is what you're looking for