All the tutorials tend to end at:
predictor.predict({"input": "YOUR_TEXT_GOES_HERE"})
It’s great that the notebooks deliver you to inference, but I have no idea how to hit this endpoint outside of a Jupyter notebook. Using the AWS SDK for Java, I basically have code that does this:
import com.amazonaws.services.sagemakerruntime.AmazonSageMakerRuntime;
import com.amazonaws.services.sagemakerruntime.AmazonSageMakerRuntimeClientBuilder;
import com.amazonaws.services.sagemakerruntime.model.InvokeEndpointRequest;
import com.amazonaws.services.sagemakerruntime.model.InvokeEndpointResult;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// ...

AmazonSageMakerRuntime runtime = AmazonSageMakerRuntimeClientBuilder.defaultClient();
String body = "{\"instances\": [{\"data\": { \"input\": \"Hello World\"}}]}";
ByteBuffer bodyBuffer = ByteBuffer.wrap(body.getBytes(StandardCharsets.UTF_8));
InvokeEndpointRequest request = new InvokeEndpointRequest()
        .withEndpointName("huggingface-pytorch-training-....")
        .withBody(bodyBuffer);
InvokeEndpointResult invokeEndpointResult = runtime.invokeEndpoint(request);
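In case the payload itself is malformed on my end, here's a minimal, JDK-only sketch I used to sanity-check that the body survives the byte round-trip (the `PayloadCheck` class name is mine, and UTF-8 is an assumption about what the endpoint expects):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class PayloadCheck {
    public static void main(String[] args) {
        // Same JSON body the request sends, with an explicit charset so the
        // bytes on the wire are unambiguous.
        String body = "{\"instances\": [{\"data\": { \"input\": \"Hello World\"}}]}";
        ByteBuffer bodyBuffer = ByteBuffer.wrap(body.getBytes(StandardCharsets.UTF_8));

        // Round-trip the buffer back to a string to confirm nothing is lost
        // in the encoding step. duplicate() keeps the original positions intact.
        String decoded = StandardCharsets.UTF_8.decode(bodyBuffer.duplicate()).toString();
        if (!decoded.equals(body)) {
            throw new AssertionError("encoding mismatch");
        }
        System.out.println("payload ok, " + bodyBuffer.remaining() + " bytes");
    }
}
```

The round-trip succeeds, so the bytes reaching the endpoint should be the literal JSON above.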
Unfortunately, I get an error:
{
  "code": 400,
  "type": "InternalServerException",
  "message": "Content type is not supported by this framework.\n\n Please implement input_fn to to deserialize the request data or an output_fn to\n serialize the response. For more information, see the SageMaker Python SDK README."
}
Am I missing something?