
Re: [HTCondor-users] Non trivial way of using DAG



Hi Cole,

Thank you for your time!

I will try to answer your questions briefly.

  1. What is the upper limit on the number of times this workflow can cycle? I assume it's a large number, but there must be some number of cycles you expect this work to be finished.
Basically, each cycle reduces the dataset based on some calculation. Suppose the first subset has n1 elements and, after a cycle, we end up with n2 elements, with n2 < n1. If this happens for every subset, D2 (the dataset obtained by merging all the subsets) has fewer elements than D1. We want to stop cycling when the number of elements cannot be reduced any further. A reasonable upper limit on the number of cycles is therefore the number of elements in the whole dataset.
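
To make the stopping rule concrete, here is a minimal Python sketch of the loop as I understand it. It runs everything locally and is not HTCondor code: the names split, drop_duplicates and cycle_until_stable are hypothetical, and drop_duplicates is only a toy stand-in for the real per-subset calculation. Every pass that does not stop the loop removes at least one element, so the loop finishes after at most len(data) + 1 passes.

```python
from typing import Callable, List, Sequence, Tuple


def split(data: Sequence[int], n_subsets: int) -> List[List[int]]:
    """Split data into contiguous chunks of near-equal size (hypothetical helper)."""
    n_subsets = max(1, min(n_subsets, len(data)))
    size, extra = divmod(len(data), n_subsets)
    chunks, start = [], 0
    for i in range(n_subsets):
        end = start + size + (1 if i < extra else 0)
        chunks.append(list(data[start:end]))
        start = end
    return chunks


def drop_duplicates(chunk: List[int]) -> List[int]:
    """Toy stand-in for the real calculation: remove repeats within one subset."""
    return list(dict.fromkeys(chunk))


def cycle_until_stable(
    data: Sequence[int],
    n_subsets: int,
    reduce_subset: Callable[[List[int]], List[int]],
) -> Tuple[List[int], int]:
    """Repeat split -> compute -> merge until a pass no longer shrinks the dataset."""
    current = list(data)
    cycles = 0
    while True:
        # Points 2 to 4 of the workflow: split, compute on each subset, merge.
        merged = [x for chunk in split(current, n_subsets) for x in reduce_subset(chunk)]
        cycles += 1
        if len(merged) >= len(current):
            # No further reduction was possible, so stop here.
            return merged, cycles
        current = merged
```

The last pass is the one that detects that nothing changed, which is why the bound is the number of elements plus one rather than exactly the number of elements.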

  2. Are all N data subset computational jobs identical except for minor variations (different input/output files, arguments, etc)?

Yes, they are!

  3. How important is it for you to know which iteration the cycle is on?

Not much, actually, but I should think about that more carefully.

Thank you for your time. If you have any other questions or need more material, feel free to ask!

Greetings,

Lorenzo

On Wed, 25 Oct 2023 at 22:05, Cole Bollig via HTCondor-users <htcondor-users@xxxxxxxxxxx> wrote:
Hi Lorenzo,

This is a very interesting workflow to automate. I do have some questions regarding this:
  1. What is the upper limit on the number of times this workflow can cycle? I assume it's a large number, but there must be some number of cycles you expect this work to be finished.
  2. Are all N data subset computational jobs identical except for minor variations (different input/output files, arguments, etc)?
  3. How important is it for you to know which iteration the cycle is on?
-Cole Bollig

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Lorenzo Mobilia <l.mobilia@xxxxxxxxxxxxxxxx>
Sent: Wednesday, October 25, 2023 2:45 AM
To: htcondor-users@xxxxxxxxxxx <htcondor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] Non trivial way of using DAG
Hi,

I am having some difficulty using DAGs. Basically, I need to:

1. Take a dataset D1
2. Split it into N sub-datasets
3. Perform some computation on these N sub-datasets
4. Merge these sub-datasets into another dataset D2
5. Restart the process (back to point 1), now using D2

This continues until the final dataset has reached specific characteristics. The problems are:

A. I don't know a priori how many times I need to split D1
B. I don't know a priori how many times I need to perform this cycle
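
To illustrate a single cycle (points 2 to 4) once N has been decided, here is a rough Python sketch that writes the DAG description at run time. It only uses the DAGMan keywords JOB, VARS and PARENT ... CHILD. The function name one_cycle_dag, the submit file names split.sub, compute.sub and merge.sub, and the macro names cycle, n_subsets and subset are all assumptions made for the example, not my actual files.

```python
def one_cycle_dag(n_subsets: int, cycle: int = 0) -> str:
    """Return DAG text for one split -> N computations -> merge cycle."""
    if n_subsets < 1:
        raise ValueError("n_subsets must be at least 1")

    compute = [f"compute_{i}" for i in range(n_subsets)]

    # split.sub, compute.sub and merge.sub are hypothetical submit files.
    lines = [
        "JOB split split.sub",
        f'VARS split cycle="{cycle}" n_subsets="{n_subsets}"',
    ]
    for i, name in enumerate(compute):
        # The N computational jobs share one submit file and differ only in VARS.
        lines.append(f"JOB {name} compute.sub")
        lines.append(f'VARS {name} cycle="{cycle}" subset="{i}"')
    lines.append("JOB merge merge.sub")
    lines.append(f'VARS merge cycle="{cycle}"')

    # The split runs first, then all computations, then the merge.
    lines.append(f"PARENT split CHILD {' '.join(compute)}")
    lines.append(f"PARENT {' '.join(compute)} CHILD merge")
    return "\n".join(lines) + "\n"
```

Writing the returned text to a file gives a DAG for one cycle only. Deciding whether to start another cycle (point 5) is the part this sketch does not cover.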

The solution I came up with is to build a main program that controls this flow, but after some cycles it crashes.

If anyone has suggestions, or is interested in this problem and would like more information, please let me know!

Hi,

Lorenzo


_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/