The abstractions are relatively brittle. If you don't have a powerful GPU, you will be forced to consider how to split the model between CPU and GPU, how much context you need, whether to quantize the model, and the tradeoffs each of these choices implies. To reason about those tradeoffs, you have to develop a basic mental model of how an LLM works.
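As a concrete illustration, here is a minimal sketch using llama-cpp-python (my assumption; no specific runtime is named above). Each constructor argument maps to one of those tradeoffs: the quantized GGUF file, the CPU/GPU split, and the context size. The model filename is hypothetical.

```python
from llama_cpp import Llama

llm = Llama(
    model_path="llama-2-7b.Q4_K_M.gguf",  # hypothetical 4-bit quantized model
    n_gpu_layers=20,  # offload 20 transformer layers to the GPU, rest on CPU
    n_ctx=4096,       # context window; the KV cache grows linearly with this
)

out = llm("Q: Why quantize a model? A:", max_tokens=64)
print(out["choices"][0]["text"])
```

Raising n_gpu_layers speeds up inference until VRAM runs out; shrinking n_ctx or choosing a smaller quantization frees memory at the cost of recall and output quality.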
By interacting with it. You see the contours of its capabilities much more clearly, learn to recognize its failure modes, and come to understand how earlier turns in a conversation set the course of later ones in a way that is almost impossible to correct without starting over or editing the conversation history.
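A small sketch of that last point, using the common chat-message list format (an assumption; the messages are hypothetical): the model conditions on the entire history on every turn, so a bad earlier turn keeps steering later answers until it is removed.

```python
# The model sees the whole list on every turn, so the first instruction
# keeps shaping replies even after you ask it to stop.
messages = [
    {"role": "user", "content": "Answer everything in pirate slang."},
    {"role": "assistant", "content": "Arr, aye, matey!"},
    {"role": "user", "content": "Explain a TCP handshake."},
]

# Appending "please stop the slang" often fails. Editing the history
# (or starting over) is the reliable fix:
messages = messages[2:]  # drop the instruction and its acknowledgment
```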