How do I log in to an https web page and download a file from Java?

Problem description:

I have to log in to an https web page and download a file using Java. I know all the URLs beforehand:

baseURL = // a https URL;
urlMap = new HashMap<String, URL>();
urlMap.put("login", new URL(baseURL, "exec.asp?login=username&pass=XPTO"));
urlMap.put("logout", new URL(baseURL, "exec.asp?exec.asp?page=999"));
urlMap.put("file", new URL(baseURL, "exec.asp?file=111"));

If I try all these links in a web browser like Firefox, they work.

Now I do this:

urlConnection = urlMap.get("login").openConnection();
urlConnection.connect();
BufferedReader in = new BufferedReader(
    new InputStreamReader(urlConnection.getInputStream()));
String inputLine;
while ((inputLine = in.readLine()) != null)
    System.out.println(inputLine);
in.close();

I just get back the login page HTML again, and I cannot proceed to file download.

Thanks!

I agree with Alnitak that the problem is likely storing and returning cookies.
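
To illustrate the cookie point, here is a minimal sketch using only the standard library. It assumes the server tracks the login with a session cookie, which the question does not confirm. Installing a default java.net.CookieManager makes later URLConnection requests send back whatever cookies the login response set. The class name, helper names, and output file name are hypothetical; the relative URLs are copied from the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class CookieLoginSketch {

    // Install a JVM-wide cookie store so that cookies set by one response
    // are sent back automatically on later requests to the same site.
    static CookieManager installCookieManager() {
        CookieManager manager = new CookieManager(null, CookiePolicy.ACCEPT_ALL);
        CookieHandler.setDefault(manager);
        return manager;
    }

    // Read and discard the response body so the request actually completes.
    static void drain(URLConnection connection) throws IOException {
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(connection.getInputStream()))) {
            while (in.readLine() != null) {
                // nothing to do with the login/logout HTML
            }
        }
    }

    // Stream the response body of fileUrl into the target path.
    static void download(URL fileUrl, Path target) throws IOException {
        try (InputStream body = fileUrl.openConnection().getInputStream()) {
            Files.copy(body, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: CookieLoginSketch <https base URL>");
            return;
        }
        installCookieManager(); // must happen before the first request
        URL baseURL = new URL(args[0]);

        // Relative URLs copied verbatim from the question.
        drain(new URL(baseURL, "exec.asp?login=username&pass=XPTO").openConnection());
        download(new URL(baseURL, "exec.asp?file=111"), Paths.get("downloaded.bin"));
        drain(new URL(baseURL, "exec.asp?exec.asp?page=999").openConnection());
    }
}
```

If the file request still returns the login page, the site may rely on something other than cookies (a redirect chain, a hidden form field, a required header), and comparing the browser's requests with the program's would be the next step.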

Another good option I have used is HttpClient from Jakarta Commons.

It's worth noting, as an aside, that if this is a server you control, sending the username and password in the query string is not secure, even over HTTPS: the full URL, credentials included, typically ends up in server logs and browser history. HttpClient supports sending parameters using POST, which you should consider.
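
As a rough sketch of the POST idea, the following sends the credentials in a form-encoded request body instead of the URL. It uses the standard HttpURLConnection rather than HttpClient, and it assumes the server accepts a POST to exec.asp with the same field names as the query string, which the question does not confirm. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class PostLoginSketch {

    // URL-encode one name or value; UTF-8 is always available on the JVM.
    static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Build an application/x-www-form-urlencoded body: name=value&name=value
    static String formEncode(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(encode(field.getKey())).append('=').append(encode(field.getValue()));
        }
        return body.toString();
    }

    // POST the fields to url and return the HTTP status code.
    static int postForm(URL url, Map<String, String> fields) throws IOException {
        byte[] body = formEncode(fields).getBytes(StandardCharsets.UTF_8);
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("POST");
        connection.setDoOutput(true); // we are going to write a request body
        connection.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        connection.setFixedLengthStreamingMode(body.length);
        try (OutputStream out = connection.getOutputStream()) {
            out.write(body);
        }
        return connection.getResponseCode();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: PostLoginSketch <https base URL>");
            return;
        }
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("login", "username"); // field names assumed from the question's query string
        fields.put("pass", "XPTO");
        int status = postForm(new URL(new URL(args[0]), "exec.asp"), fields);
        System.out.println("HTTP status: " + status);
    }
}
```

Combined with a cookie store installed before the first request, the session cookie from this POST would be reused by the later file download.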