I need to give some background as well, so my reply will be a little long. lol
1) Here we are talking about long-term memory (LTM), i.e. information that is encoded, stored and can be retrieved even after a long time. LTMs are divided into declarative (consciously recalled) and non-declarative (used unconsciously). Declarative memory is further divided into episodic (events) and semantic (facts, knowledge). The movie Sholay is an episodic declarative LTM. It is a past episode in your life which was important enough to move from short-term memory to long-term memory, and was therefore encoded and stored. It was stored as an established network (an engram), which is either a dedicated neurocircuit (meaning a separate one for each episode) or a pattern of activity across shared neurocircuitry (a different firing pattern for each event). As I said before, I am not clear on which of the two it is. Either way, the synaptic connections within that network are strengthened.
Episodic and semantic memories (declarative LTM) are encoded and stored largely by medial temporal lobe structures such as the hippocampus and the surrounding cortices of the temporal lobe. We should also note that these are not the only structures involved in the process. Besides the sensory processing areas, the frontal lobe is also involved, in the sense that it is responsible for attention towards an event. Attention in turn is one of the factors that decides whether a short-term memory gets encoded into LTM, and also the level/quality of the encoding.
2) Encoding Sholay would mean putting together, or integrating, different sensory stimuli (coming from their respective sensory cortical areas/lobes, located all over the brain but largely in the lateral cortex), and it is more than just the sound and visuals coming from the movie. It involves the context, the surroundings, the time and many other aspects. For example, you may have thought of a Western cowboy movie while watching the initial scene when the train arrives and the inspector heads for Thakur's home. That thought is also encoded. The most important region is the hippocampus. All the input goes to this region of the medial temporal lobe.
3) All this happened 5 years ago. Now you were going somewhere and suddenly you saw a person looking like Gabbar Singh. That is a cue. The information will flow from the visual cortex towards the medial temporal lobe. The frontal lobe also plays an important role here, in the conscious retrieval of stored memories (you wonder who he resembles), and directs the query to the temporal lobe. Once it is there, it reactivates the network/pattern/engram that best matches the cue (the one for Gabbar Singh), which in turn stimulates the whole pathway in reverse order, i.e. it involves all the sensory cortices which processed the original stimuli, and there you go, you are able to recall Gabbar Singh and related memories lol. It is kind of like that. By looking at a person you consciously asked a question, and it was directed to the temporal lobe, which retrieved the memory closest to the input.
Note that the reason we do not actually see or hear Gabbar Singh is that we are talking about activation of the sensory/stimulus processing areas, not about signals going from those areas back to the respective sensory organs. That, I believe, is one-way traffic. This is why you imagine it in your mind but it does not replace what you are seeing and hearing at that moment, and hence it is not the same as experiencing a true event.
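The "retrieves the memory closest to the input" idea in point 3 can be pictured with a toy sketch. This is only an analogy, not a model of how the hippocampus actually works: each stored memory is treated as a set of features, and a cue pulls out whichever stored set overlaps with it the most. The feature lists, the `jaccard` helper and the `recall` function are all made up for illustration.

```python
# Toy analogy only: each stored "engram" is a set of features, and a cue
# retrieves the stored memory whose features overlap with it the most.
# All feature names below are hypothetical, invented for this sketch.

def jaccard(a, b):
    """Overlap between two feature sets: 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recall(cue, memories):
    """Return the name of the stored memory that best matches the cue."""
    return max(memories, key=lambda name: jaccard(cue, memories[name]))


memories = {
    "Sholay": {"beard", "army fatigues", "menacing laugh", "train", "village"},
    "office party": {"cake", "colleagues", "music"},
}

# A stranger on the street shares only some features with the stored memory.
cue = {"beard", "army fatigues", "stranger on street"}

print(recall(cue, memories))
```

The cue does not have to match the stored memory exactly. A partial overlap is enough to pick it out, which is the point of the Gabbar Singh lookalike example.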
Though we have the capacity to store a lot, we still cannot recall a lot, and certainly not all at once or in order. The reason is that our brain is not like a DVD, which stores exactly what you write on it. There are lots of things going on around us, the majority of which we filter out. What we pay attention to is what gets encoded, and attention is not the only factor. Episodic memory in particular is stored selectively, and therefore we only store and remember things that are relevant and important to us.
In conclusion, the regions above are the ones involved. Initially the experience was living the event (sensory organs + sensory areas processing the information), while retrieval involves the medial temporal lobe and the sensory processing areas recreating the memory. I guess it is these two that largely "experience" it on retrieval.