Java interview questions
Language
- What are the differences between C++ and Java?
C++ is not platform-independent; the principle behind C++ programming is “write once, compile anywhere.”
In contrast, the bytecode generated by the Java compiler is platform-independent and can run on any machine that has a JVM, so Java programs follow the principle “write once, run anywhere.”
- Explain JVM, JRE, and JDK.
The JVM (Java Virtual Machine) executes bytecode. The JRE bundles the JVM with the core class libraries, and the JDK adds development tools (compiler, debugger) on top of the JRE.
| JDK | JRE |
|---|---|
| Java Development Kit | Java Runtime Environment |
| A kit dedicated to software development | A set of software and libraries for executing Java programs |
| Platform-dependent: built per operating system and architecture | Also platform-dependent |
| Includes tools for developing and debugging | Includes only the files and libraries needed at runtime |
| Ships with the compiler (javac) and other command-line tools | Has no compiler; it provides only a runtime environment |
- What is a ClassLoader?
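A class loader is a subsystem of the Java Virtual Machine dedicated to loading class files on demand while a program runs. It is the first component to act at startup, because the class that contains main must be loaded before any application code can execute.
Java has Bootstrap, Extension, and Application class loaders (since Java 9 the Extension class loader is replaced by the Platform class loader).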
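As a rough illustration of the loader hierarchy, the sketch below prints which loader defined a class and walks the parent chain. The class name `ClassLoaderDemo` is made up for this example, and the exact loader names printed depend on the JDK version.

```java
class ClassLoaderDemo {
    // Classes loaded by the bootstrap loader report a null ClassLoader.
    static String loaderName(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        return loader == null ? "bootstrap" : loader.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("String          -> " + loaderName(String.class));
        System.out.println("ClassLoaderDemo -> " + loaderName(ClassLoaderDemo.class));

        // Walk up from the application (system) class loader to its parents.
        for (ClassLoader cl = ClassLoader.getSystemClassLoader(); cl != null; cl = cl.getParent()) {
            System.out.println("chain: " + cl);
        }
    }
}
```

Core classes such as `String` come from the bootstrap loader, while application classes are normally defined by the application loader.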
- What are the Memory Allocations available in Java?
Java has five significant types of memory allocation:
Class Memory
Heap Memory
Stack Memory
Program Counter Memory
Native Method Stack Memory
- What are the differences between Heap and Stack Memory in Java?
Stack memory is allocated per thread and holds method frames: local variables, parameters, and object references. It has a fixed maximum size (configurable with -Xss), and a frame is discarded as soon as its method returns. Heap memory, in contrast, is shared by all threads and holds the objects created at runtime with new; it is reclaimed by the garbage collector once objects are no longer reachable.
- Explain Java String Pool.
The Java String Pool is a collection of strings stored in heap memory. When you create a string from a literal, the JVM first checks whether an equal string is already in the pool. If it is, the existing reference is shared with the variable; otherwise a new object is created and added to the pool. Strings created with new String(...) bypass the pool unless intern() is called.
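The pooling behavior can be observed with reference comparisons. This is a small sketch (the class and method names are invented for the example); `==` is used here only to compare object identity, not as a recommended way to compare strings.

```java
import java.util.Arrays;

class StringPoolDemo {
    // Returns { literal == literal, literal == new String, literal == interned }.
    static boolean[] identityChecks() {
        String a = "hello";              // literal: stored in the string pool
        String b = "hello";              // same literal: reuses the pooled object
        String c = new String("hello");  // explicit new: a separate heap object
        String d = c.intern();           // intern(): returns the pooled instance
        return new boolean[] { a == b, a == c, a == d };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(identityChecks())); // prints [true, false, true]
    }
}
```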
- Which among String or String Buffer should be preferred when there are a lot of updates required to be done in the data?
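String is immutable, so every update creates a new object; when the data changes frequently, prefer a mutable StringBuffer or StringBuilder. StringBuilder is faster than StringBuffer because its methods are not synchronized, so use it wherever possible. However, StringBuffer is the better choice if thread safety is required.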
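The three options side by side, as a minimal sketch (method names are illustrative). All three produce the same text; they differ in how many intermediate objects are created and in whether the methods are synchronized.

```java
class StringUpdateDemo {
    // Repeated += on a String creates a new String object on every iteration.
    static String joinWithConcat(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += i;
        }
        return s;
    }

    // StringBuilder appends into one mutable buffer; it is not thread-safe.
    static String joinWithBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i);
        }
        return sb.toString();
    }

    // StringBuffer offers the same API with synchronized methods.
    static String joinWithBuffer(int n) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < n; i++) {
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinWithBuilder(5)); // prints 01234
    }
}
```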
- Can you explain the Java thread lifecycle?
A thread can be in any of the following states in Java:
New: A thread is in the new state after it has been created but before start() is called; its code has not begun to execute.
Active: Calling start() moves the thread from the new state to the active state. The active state covers both runnable (ready and waiting for CPU time) and running; Java reports both as RUNNABLE.
Blocked or Waiting: A thread is blocked while it waits to acquire a monitor lock, and waiting while it waits for another thread to act, for example through notify() or the completion of a join().
Timed waiting: Calling sleep() (or wait()/join() with a timeout) puts the thread into timed waiting. The thread wakes when the allotted time has passed and resumes execution where it left off.
Terminated: The thread has finished its run() method or died from an uncaught exception. It is no longer active and cannot be restarted.
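The states can be observed through `Thread.getState()`. The sketch below (class and method names are made up) records the state before `start()`, while the worker sleeps, and after `join()`. The middle observation is timing-dependent, so the code polls for it instead of assuming it.

```java
import java.util.ArrayList;
import java.util.List;

class ThreadStateDemo {
    // Returns the states seen before start(), during sleep(), and after join().
    static List<Thread.State> observeStates() {
        List<Thread.State> seen = new ArrayList<>();
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(200);                 // puts the worker into TIMED_WAITING
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        seen.add(worker.getState());               // NEW: created, start() not yet called
        worker.start();                            // worker becomes RUNNABLE

        // Poll until the worker is seen sleeping; stop early if it already finished.
        Thread.State mid;
        do {
            mid = worker.getState();
        } while (mid != Thread.State.TIMED_WAITING && mid != Thread.State.TERMINATED);
        seen.add(mid);                             // normally TIMED_WAITING

        try {
            worker.join();                         // wait for run() to return
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        seen.add(worker.getState());               // TERMINATED
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(observeStates());
    }
}
```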
- What is a Memory Leak? Discuss some common causes of it.
In Java, a memory leak occurs when objects that are no longer needed remain reachable, so the garbage collector cannot reclaim them. Heap usage grows over time, performance degrades, and the application can eventually fail with an OutOfMemoryError. Common causes include oversized session objects, insertion into collections (especially static ones) without removal, unbounded caches, listeners that are registered but never unregistered, unclosed resources, and poorly written custom data structures.
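One of the causes above, listeners that are never unregistered, can be sketched as follows. `EventBus` and `LeakDemo` are hypothetical classes written for this example; the listener count stands in for retained memory, since every registered listener keeps its captured state reachable.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical long-lived registry: everything in the list stays strongly reachable.
class EventBus {
    private final List<Runnable> listeners = new ArrayList<>();

    void register(Runnable listener)   { listeners.add(listener); }
    void unregister(Runnable listener) { listeners.remove(listener); }
    int listenerCount()                { return listeners.size(); }
}

class LeakDemo {
    // Leaky: each listener captures a buffer and is never removed from the bus.
    static int leakyUsage(EventBus bus, int screens) {
        for (int i = 0; i < screens; i++) {
            byte[] screenState = new byte[1024];      // stands in for a heavy object
            bus.register(() -> touch(screenState));   // stays reachable through the bus
        }
        return bus.listenerCount();
    }

    // Fixed: the listener is unregistered once the owning component is done.
    static int fixedUsage(EventBus bus, int screens) {
        for (int i = 0; i < screens; i++) {
            byte[] screenState = new byte[1024];
            Runnable listener = () -> touch(screenState);
            bus.register(listener);
            try {
                listener.run();                       // the component does its work
            } finally {
                bus.unregister(listener);             // releases the reference
            }
        }
        return bus.listenerCount();
    }

    private static void touch(byte[] buffer) { buffer[0]++; }

    public static void main(String[] args) {
        System.out.println(leakyUsage(new EventBus(), 1000)); // prints 1000
        System.out.println(fixedUsage(new EventBus(), 1000)); // prints 0
    }
}
```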
Redis / RabbitMQ persistence
- What persistence options does Redis provide?
Redis offers two main mechanisms: RDB (Redis Database snapshots) and AOF (Append-Only File).
RDB takes point-in-time binary snapshots of the dataset and writes them to disk. Snapshots are triggered by SAVE (blocking) or BGSAVE (which forks a child process that uses copy-on-write so the parent keeps serving traffic). RDB files are compact and restart recovery is fast, but any writes made between the last snapshot and a crash are lost.
AOF logs every write command to a file and replays it on startup. The appendfsync policy can be always (safest, slowest), everysec (default, at most ~1 second of data loss) or no (rely on the OS). The AOF file is rewritten periodically (BGREWRITEAOF) to compact it: Redis writes out the shortest command sequence that recreates the current dataset and discards the older history.
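The append, replay, and rewrite cycle can be modeled with a toy in-memory store. This is not how Redis is implemented; `ToyAppendOnlyStore` is a made-up class that uses a list of strings in place of the file on disk and assumes keys contain no spaces.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ToyAppendOnlyStore {
    private final Map<String, String> data = new LinkedHashMap<>();
    private List<String> log = new ArrayList<>();   // stands in for the AOF on disk

    void set(String key, String value) {
        log.add("SET " + key + " " + value);        // append the command
        data.put(key, value);                       // then apply it
    }

    void del(String key) {
        log.add("DEL " + key);
        data.remove(key);
    }

    // Rewrite: replace the history with one SET per key that is still live.
    void rewrite() {
        List<String> compact = new ArrayList<>();
        for (Map.Entry<String, String> e : data.entrySet()) {
            compact.add("SET " + e.getKey() + " " + e.getValue());
        }
        log = compact;
    }

    // Recovery: rebuild the dataset by replaying the log from the start.
    static Map<String, String> replay(List<String> log) {
        Map<String, String> state = new LinkedHashMap<>();
        for (String line : log) {
            String[] parts = line.split(" ", 3);
            if (parts[0].equals("SET")) {
                state.put(parts[1], parts[2]);
            } else {
                state.remove(parts[1]);
            }
        }
        return state;
    }

    List<String> log()         { return log; }
    Map<String, String> data() { return data; }
}
```

After `set("a","1")`, `set("a","2")`, `set("b","3")`, `del("b")` the log holds four commands; after `rewrite()` it holds only `SET a 2`, and replaying either log yields the same dataset.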
- What is hybrid persistence (Redis 4+)?
Since Redis 4, AOF rewrite can dump the current dataset as an RDB-format prefix and then append incremental commands after it. You get RDB's fast recovery together with AOF's low data-loss guarantee. In production this is usually the recommended setup.
- Any operational pitfalls around persistence?
Both BGSAVE and AOF rewrite call fork(). On large instances (tens of GB) the page-table copy can stall the process for hundreds of milliseconds. Disabling Transparent Huge Pages (THP) and leaving enough free memory for copy-on-write are standard mitigations.
- How do you make a message survive a RabbitMQ restart?
You need three things together: a durable exchange, a durable queue, and a persistent message (deliveryMode = 2). If any of the three is missing, the message is lost on broker restart.
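```java
boolean durable = true;
// A durable exchange and queue survive a broker restart.
channel.exchangeDeclare("my_exchange", "direct", durable);
channel.queueDeclare("my_queue", durable, false, false, null); // not exclusive, not auto-delete
channel.queueBind("my_queue", "my_exchange", "my_key");
// PERSISTENT_TEXT_PLAIN sets deliveryMode = 2; payload is a byte[].
channel.basicPublish("my_exchange", "my_key",
        MessageProperties.PERSISTENT_TEXT_PLAIN,
        payload);
```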
- Does persistence alone guarantee "no message loss"?
No. Persistence only protects against broker restarts. To cover producer and consumer failures you also need publisher confirms (channel.confirmSelect() plus confirm listeners) on the producer side and manual consumer acknowledgements (basicAck after the message is fully processed, basicNack/basicReject on failure) on the consumer side. For strict end-to-end reliability, combine persistence, publisher confirms, manual acks, and mirrored/quorum queues.
Nginx / Docker / Elasticsearch
- What is Nginx and why is it commonly paired with Java backends?
Nginx is a high-performance web server and reverse proxy built on an event-driven, asynchronous, non-blocking model. A single worker can handle thousands of concurrent connections with low memory usage, which is why it is typically placed in front of Java application servers (Tomcat, Spring Boot) to terminate TLS, serve static content, load-balance across instances, and buffer slow clients.
- How is an Nginx configuration structured?
Configuration is organized into nested contexts: a main context (worker processes, user), an events block (connection model), an http block (global HTTP settings), one or more server blocks (virtual hosts, listen, server_name), and location blocks inside them (routing rules, proxy_pass, rewrite).
- How does Nginx reload configuration without downtime?
nginx -s reload tells the master to parse the new configuration, start new workers, and gracefully stop old ones once their in-flight connections finish. Existing requests are not dropped.
- What is Docker and how does it differ from a virtual machine?
Docker packages an application together with its dependencies into an image and runs it as a container, which is an isolated Linux process that uses kernel features like namespaces (PID, network, mount, user) for isolation and cgroups for resource limits. Unlike a VM, a container does not ship a guest kernel, so startup is near-instant and overhead is much lower. VMs give stronger isolation; containers give density and speed.
- How would you containerize a Spring Boot application?
A typical pattern is a multi-stage Dockerfile: the first stage uses a JDK with Maven/Gradle to build the fat JAR, and the second stage copies just the JAR onto a slim JRE base image. This keeps the final image small and free of build tools.
```dockerfile
FROM maven:3.9-eclipse-temurin-21 AS build
WORKDIR /app
COPY pom.xml .
RUN mvn -B dependency:go-offline
COPY src ./src
RUN mvn -B package -DskipTests

FROM eclipse-temurin:21-jre
WORKDIR /app
COPY --from=build /app/target/*.jar app.jar
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "/app/app.jar"]
```
- What is Elasticsearch and what problem does it solve?
Elasticsearch is a distributed, RESTful search and analytics engine built on Apache Lucene. It stores documents as JSON and builds an inverted index (term → list of document IDs), which makes full-text queries orders of magnitude faster than a LIKE scan in a relational database. It is the "E" in the ELK stack (Elasticsearch, Logstash, Kibana), which is widely used for centralized logging and observability.
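To make the inverted index idea concrete, here is a toy version in plain Java. It is only a sketch of the data structure (term mapped to a sorted set of document IDs) with a naive tokenizer; it does not reflect Lucene's actual storage format, and the class name is made up.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

class InvertedIndex {
    // term -> sorted IDs of the documents that contain it
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();

    void add(int docId, String text) {
        // Naive analysis: lowercase and split on non-word characters.
        for (String term : text.toLowerCase().split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    // AND query: IDs of documents that contain every given term.
    SortedSet<Integer> search(String... terms) {
        SortedSet<Integer> result = null;
        for (String term : terms) {
            SortedSet<Integer> ids =
                    postings.getOrDefault(term.toLowerCase(), Collections.emptySortedSet());
            if (result == null) {
                result = new TreeSet<>(ids);
            } else {
                result.retainAll(ids);
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    public static void main(String[] args) {
        InvertedIndex index = new InvertedIndex();
        index.add(1, "Redis is a fast cache");
        index.add(2, "Elasticsearch is a search engine");
        index.add(3, "Fast full-text search");
        System.out.println(index.search("search"));         // prints [2, 3]
        System.out.println(index.search("fast", "search")); // prints [3]
    }
}
```

A lookup touches only the posting lists of the query terms, whereas a LIKE scan has to inspect every row.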
- What are the core Elasticsearch concepts?
Cluster (one or more nodes), node, index (logical collection of documents, similar to a database), document (a JSON record), shard (a Lucene index; an index is split into primary shards and each has replicas for HA), and mapping (the schema for fields). Elasticsearch listens on port 9200 for the REST API and 9300 for inter-node transport.
MySQL, transactions (Atomicity, Consistency, Isolation, Durability)
- What does ACID mean?
Atomicity – all statements in a transaction succeed together, or none of them take effect.
Consistency – each transaction moves the database from one valid state to another, respecting constraints.
Isolation – concurrent transactions behave as if executed serially, with anomalies controlled by the isolation level.
Durability – once a transaction commits, its effects survive crashes and restarts (in InnoDB this is enforced by the redo log and fsync).
- What transaction control statements does MySQL support?
START TRANSACTION / BEGIN – open a new transaction.
SAVEPOINT name – mark a point you can partially roll back to.
ROLLBACK TO SAVEPOINT name – undo work done after that savepoint but keep the transaction open.
ROLLBACK – undo the whole transaction.
COMMIT – persist all changes.
- What are the four SQL isolation levels and what anomalies do they allow?
| Level | Dirty read | Non-repeatable read | Phantom read |
|---|---|---|---|
| Read Uncommitted | possible | possible | possible |
| Read Committed | prevented | possible | possible |
| Repeatable Read (InnoDB default) | prevented | prevented | prevented in InnoDB via MVCC + gap locks |
| Serializable | prevented | prevented | prevented |
A dirty read sees uncommitted data from another transaction. A non-repeatable read means the same row returns different values in two reads of the same transaction. A phantom read means a range query returns a different number of rows when repeated. Standard SQL allows phantoms at Repeatable Read; InnoDB blocks them for snapshot reads via MVCC and for locking reads via gap/next-key locks.
- How does InnoDB implement isolation with MVCC?
Each row carries hidden system columns, including trx_id (DB_TRX_ID, the transaction that last modified it) and roll_pointer (DB_ROLL_PTR, a pointer into the undo log). When a snapshot read runs, InnoDB creates a ReadView containing the currently active transaction IDs. A row version is visible if its trx_id was committed before the ReadView was taken; otherwise InnoDB walks the undo chain through roll_pointer to find an older, visible version. Snapshot reads never block writers and vice versa, which is why MVCC scales much better than pure locking.
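The visibility check can be sketched in a few lines. This is a simplified model, not InnoDB source code: `RowVersion`, `ReadView`, and all field names are invented for the illustration, and the undo chain is modeled as a linked list of older versions.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// One version of a row; 'older' plays the role of the roll_pointer target.
class RowVersion {
    final long trxId;
    final String value;
    final RowVersion older;

    RowVersion(long trxId, String value, RowVersion older) {
        this.trxId = trxId;
        this.value = value;
        this.older = older;
    }
}

class ReadView {
    private final long creatorTrxId;        // transaction that owns this view
    private final Set<Long> activeTrxIds;   // uncommitted transactions at snapshot time
    private final long minActiveTrxId;      // smallest active ID
    private final long nextTrxId;           // first ID not yet assigned at snapshot time

    ReadView(long creatorTrxId, Set<Long> activeTrxIds, long nextTrxId) {
        this.creatorTrxId = creatorTrxId;
        this.activeTrxIds = activeTrxIds;
        this.nextTrxId = nextTrxId;
        this.minActiveTrxId = activeTrxIds.isEmpty() ? nextTrxId : Collections.min(activeTrxIds);
    }

    boolean isVisible(long trxId) {
        if (trxId == creatorTrxId) return true;     // our own changes
        if (trxId < minActiveTrxId) return true;    // committed before the snapshot
        if (trxId >= nextTrxId) return false;       // started after the snapshot
        return !activeTrxIds.contains(trxId);       // in between: visible only if committed
    }

    // Walk from the newest version toward older ones until one is visible.
    String read(RowVersion newest) {
        for (RowVersion v = newest; v != null; v = v.older) {
            if (isVisible(v.trxId)) return v.value;
        }
        return null;                                // row did not exist for this snapshot
    }

    public static void main(String[] args) {
        RowVersion v1 = new RowVersion(5, "A", null);
        RowVersion v2 = new RowVersion(8, "B", v1);
        RowVersion v3 = new RowVersion(12, "C", v2);
        Set<Long> active = new HashSet<>(Arrays.asList(8L, 9L));
        System.out.println(new ReadView(9, active, 11).read(v3)); // prints A
        System.out.println(new ReadView(8, active, 11).read(v3)); // prints B
    }
}
```

Transaction 9 skips version C (written after its snapshot) and version B (writer still active), and lands on A. Transaction 8 sees its own uncommitted write, B.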
Unit test
- What is unit testing?
Unit testing verifies the smallest testable pieces of code (a method or a class) in isolation from their dependencies, typically by the developer before integration tests. External collaborators (databases, HTTP clients, queues) are replaced with test doubles so failures pinpoint the unit under test.
- What is JUnit and what annotations do you use most?
JUnit is the de-facto Java unit-test framework. Core JUnit 5 annotations: @Test marks a test method; @BeforeEach / @AfterEach run before/after every test; @BeforeAll / @AfterAll run once per class (and must be static in JUnit 5 unless the class is PER_CLASS lifecycle); @DisplayName and @ParameterizedTest improve readability and coverage.
- What is Mockito and when do you use it?
Mockito is a mocking framework that generates mock objects at runtime (it uses Byte Buddy for bytecode generation). You use it when the unit under test depends on a collaborator whose real implementation is slow, non-deterministic, or has side effects (database, REST client, clock). @Mock creates a mock, @InjectMocks wires mocks into the class under test, when(x.foo()).thenReturn(y) stubs behavior, and verify(x).foo() asserts interactions.
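```java
import static org.mockito.ArgumentMatchers.*;
import static org.mockito.Mockito.*;

import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;
import org.mockito.InjectMocks;
import org.mockito.Mock;
import org.mockito.junit.jupiter.MockitoExtension;

// PaymentClient and OrderService are the application classes (not shown here).
@ExtendWith(MockitoExtension.class)
class OrderServiceTest {
    @Mock PaymentClient payment;        // collaborator replaced by a mock
    @InjectMocks OrderService service;  // class under test, receives the mock

    @Test
    void chargesCustomerOnce() {
        when(payment.charge(anyString(), eq(100))).thenReturn("OK");
        service.placeOrder("u1", 100);
        verify(payment, times(1)).charge("u1", 100);
    }
}
```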
- What do you consider a "good" unit test?
Fast (milliseconds), deterministic (no sleeps, no network), isolated (one failure reason), readable (Arrange / Act / Assert), and focused on behavior rather than implementation details so refactoring does not break it.
Project architecture, common middleware, system cache
- How do you layer a typical Java backend project?
Most Spring projects follow a layered architecture: controller / presentation (HTTP entry points, DTO validation), service / business logic (use cases, transactions), repository / data access (JPA, MyBatis), and domain / model (entities, value objects). Each layer depends only on the layer directly beneath it, which keeps the code modular, testable, and easy to evolve.
- What middleware commonly sits in a production stack and why?
Nginx – reverse proxy, TLS termination, load balancing, static content.
Redis – distributed cache, session store, rate limiter, distributed lock.
RabbitMQ / Kafka – asynchronous messaging, decoupling services, smoothing traffic spikes.
Elasticsearch – full-text search, log aggregation, analytics.
MySQL / PostgreSQL – primary transactional store.
Prometheus / Grafana / ELK – metrics, dashboards, centralized logging.
- How would you design a distributed cache layer for a read-heavy service?
Start with access patterns (read/write ratio, latency target, data size). Use a cache-aside pattern: the service reads from Redis first and on a miss loads from the database and back-fills. For scale, shard keys with consistent hashing so adding or removing a node only moves a small fraction of keys. Pick an eviction policy (LRU, LFU, TTL) based on the workload. Handle the classic failure modes:
- Cache penetration (query for keys that never exist) – cache a negative marker or use a Bloom filter.
- Cache breakdown (hot key expires, stampede hits the DB) – use a mutex / singleflight to rebuild, or never expire the hottest keys.
- Cache avalanche (many keys expire at the same time) – add jitter to TTLs, and run Redis in HA with replicas.
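The cache-aside flow and the three mitigations can be combined in one small sketch. Assumptions: a `ConcurrentHashMap` stands in for Redis, the `loader` function stands in for the database query and returns null for a missing row, and the per-key locking of `compute` only protects a single JVM (a distributed setup would need a distributed lock). `CacheAside` is a made-up class name.

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Function;

class CacheAside<K, V> {
    private static final class Entry<T> {
        final Optional<T> value;       // empty Optional = negative marker
        final long expiresAtMillis;

        Entry(Optional<T> value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final ConcurrentHashMap<K, Entry<V>> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;
    private final long baseTtlMillis;

    CacheAside(Function<K, V> loader, long baseTtlMillis) {
        this.loader = loader;
        this.baseTtlMillis = baseTtlMillis;
    }

    Optional<V> get(K key) {
        long now = System.currentTimeMillis();
        Entry<V> entry = cache.compute(key, (k, current) -> {
            if (current != null && current.expiresAtMillis > now) {
                return current;                                   // cache hit
            }
            // Miss or expired. compute() locks this key, so concurrent callers
            // wait for one reload instead of all hitting the database (breakdown).
            Optional<V> loaded = Optional.ofNullable(loader.apply(k));
            // Missing rows are cached as an empty Optional (penetration).
            // Jitter of up to 10% spreads out expiry times (avalanche).
            long jitter = ThreadLocalRandom.current().nextLong(baseTtlMillis / 10 + 1);
            return new Entry<>(loaded, now + baseTtlMillis + jitter);
        });
        return entry.value;
    }
}
```

Holding the per-key lock during a slow load is acceptable for a sketch; a production cache would usually load outside the map operation, for example through a future stored in the map.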
- How is an LRU cache usually implemented?
With a HashMap + doubly linked list: the map gives O(1) lookup by key, the list keeps entries ordered by recency, and both get and put run in O(1). On eviction, remove the tail node and the matching map entry. In Java, LinkedHashMap with accessOrder = true and an overridden removeEldestEntry gives you this for free.
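The LinkedHashMap approach fits in a few lines. This is a minimal, non-thread-safe sketch; the class name `LruCache` is chosen for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true);   // accessOrder = true: get() moves an entry to the end
        this.capacity = capacity;
    }

    // Called by LinkedHashMap after each insertion; true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");           // "a" becomes the most recently used entry
        cache.put("c", 3);        // over capacity: evicts "b", the least recently used
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```

For concurrent use, the map would need external synchronization, for instance through Collections.synchronizedMap.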
- What lifecycle phases do you split a project into, and what do you focus on in each?
A pragmatic breakdown:
- Requirements – clarify goals, non-functional requirements (SLA, QPS, data volume), and boundaries. The main risk here is ambiguity, so the focus is on written specs and explicit acceptance criteria.
- Design – high-level architecture, module split, data model, API contracts, capacity and failure modes. Focus on trade-offs (consistency vs availability, sync vs async, build vs buy) and on documenting decisions.
- Implementation – coding to the design, with code review, static analysis, and a branching strategy. Focus on readability, adherence to conventions, and keeping pull requests small.
- Testing – unit, integration, contract, and performance tests. Focus on coverage of critical paths, reproducible environments, and regression protection.
- Release / deployment – CI/CD pipeline, blue-green or canary rollouts, database migrations, feature flags. Focus on rollback safety and change risk.
- Operations and monitoring – metrics, logs, traces, alerts, runbooks, on-call. Focus on MTTR, capacity planning, and continuous improvement based on incident postmortems.
- Maintenance / evolution – bug fixes, refactoring, dependency upgrades, deprecations. Focus on keeping the system healthy and the cost of change low.